thr3ads.net - similar to: "[LLVMdev] Alias in LLVM 3.0"

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] Alias in LLVM 3.0"

2012 Feb 28

[LLVMdev] Alias in LLVM 3.0

Hi Richard, > In LLVM 2.9 and LLVM 3.0, our front-end generates: > > @__shuffle_2f32_2u32 = alias weak <2 x i32> (<2 x i32>, <2 x i32>)* @4 > > And the calls, before linking, look like: > > %call9 = call <2 x float> @__shuffle_2f32_2u32(<2 x float> %tmp7, <2 x i32> > %tmp8) nounwind I don't see how this is possible - it should be

[LLVMdev] A potential bug

2011 Oct 06

[LLVMdev] A potential bug

Hi all, There might be a bug in DeadStoreElimination.cpp. This pass eliminates stores backwards aggressively in an end BB. It does not check dependencies on stores in an end BB though. For example, in this code snippet: ... 1. %sum.safe_r47.pre-phi = phi i64* [ %sum.safe_r47.pre, %entry.for.end_crit_edge ], [ %sum.safe_r42, %for.body ] 2. %call9 = call i32 @gettimeofday(%struct.timeval* %end,

[LLVMdev] A potential bug

2011 Oct 06

[LLVMdev] A potential bug

On Thu, Oct 6, 2011 at 2:20 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Thu, Oct 6, 2011 at 2:12 PM, Zeng Bin <ezengbin at gmail.com> wrote: >> Hi all, >> >> There might be a bug in DeadStoreElimination.cpp. This pass eliminates >> stores backwards aggressively in an end BB. It does not check dependencies >> on stores in an end BB though.

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Nov 09

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

Dear all, I'm attaching a patch that should fix the issue mentioned above. It simply makes the same check seen in the same file for global variables: emitPTXAddressSpace(PTy->getAddressSpace(), O); if (GVar->getAlignment() == 0) O << " .align " << (int) TD->getPrefTypeAlignment(ETy); else O << " .align " <<

[LLVMdev] A potential bug

2011 Oct 06

[LLVMdev] A potential bug

On Thu, Oct 6, 2011 at 2:12 PM, Zeng Bin <ezengbin at gmail.com> wrote: > Hi all, > > There might be a bug in DeadStoreElimination.cpp. This pass eliminates > stores backwards aggressively in an end BB. It does not check dependencies > on stores in an end BB though. For example, in this code snippet: > ... > 1. %sum.safe_r47.pre-phi = phi i64* [ %sum.safe_r47.pre,

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Jul 11

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

Hello, FYI, this is a bug http://llvm.org/bugs/show_bug.cgi?id=13324 When compiling the following code for sm_20, func params are by some reason given with .align 0, which is invalid. Problem does not occur if compiled for sm_10. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple =

[LLVMdev] Re : ANN: libclc (OpenCL C library implementation)

2011 Oct 20

[LLVMdev] Re : ANN: libclc (OpenCL C library implementation)

Hello, I am the developer of Clover, and so much activity about OpenCL these days is really exciting. Here is my point of view, mainly on Clover and how the projects could use each other. Clover is made in a way that allow a certain level of modularity. Although POCL would be very difficult to merge into Clover (or Clover into POCL), as these two projects are nearly exactly doing the same things

[LLVMdev] Functions: sret and readnone

2009 Nov 05

[LLVMdev] Functions: sret and readnone

It's been a while and I finally had the time to look into this. What I did was to build a custom AliasAnalysis pass, as Chris suggested, that returns AliasAnalysis::Mod for values passed to the sample function in the sret spot, and NoModRef for all other values. I'm also returning AliasAnalysis::AccessesArguments in the pass' getModRefBehavior methods. However, I haven't been

[LLVMdev] A potential bug

2011 Oct 06

[LLVMdev] A potential bug

It does not do anything. It is an abstract function which transforms a pointer and returns another pointer of the same type. It does not visit memory or capture the pointer parameter. On Thu, Oct 6, 2011 at 2:22 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Thu, Oct 6, 2011 at 2:20 PM, Eli Friedman <eli.friedman at gmail.com> > wrote: > > On Thu, Oct 6, 2011 at

[LLVMdev] A potential bug

2011 Oct 06

[LLVMdev] A potential bug

If int_guard_load returns a pointer based on the passed-in pointer, it captures it (at least according to the definition of "capture" which NoCapture uses). -Eli On Thu, Oct 6, 2011 at 2:26 PM, Zeng Bin <ezengbin at gmail.com> wrote: > It does not do anything. It is an abstract function which transforms a > pointer and returns another pointer of the same type. It does not

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

Alex, From my experience in working with GPU vector registers; there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple

OpenCL toolset (for AMD GPU)

2015 Sep 29

OpenCL toolset (for AMD GPU)

Hi LLVM, I would like to compile OpenCL kernel for a specific AMD GPU target. Is it possible with the current clang/LLVM? I started by using `clang -x cl` but it looks like at least some OpenCL specific headers are missing (e.g. uint2 is not recognized as a type). Any links to documentation / tutorials very welcome. Thanks. - Paweł -------------- next part -------------- An HTML attachment was

[LLVMdev] Missing Optimization Opportunities

2010 Sep 10

[LLVMdev] Missing Optimization Opportunities

Hi, I'm using LLVM 2.7 right now, and I found "opt -std-compile-opts" has missed some opportunities for optimization: define void @spa.main() readonly { entry: %tmp = load i32* @dst-ip ; <i32> [#uses=3] %tmp1 = and i32 %tmp, -16777216 ; <i32> [#uses=1] %tmp2 = icmp eq i32 %tmp1, 167772160 ; <i1> [#uses=2]

[LLVMdev] Functions: sret and readnone

2009 Oct 05

[LLVMdev] Functions: sret and readnone

Hi all, I'm currently building a DSL for a computer graphics project that is not unlike NVIDIA's Cg. I have an intrinsic with the following signature float4 sample(texture tex, float2 coords); that is translated to this LLVM IR code: declare void @"sample"(%float4* noalias nocapture sret, %texture, $float2) nounwind readnone The type float4 is basically an array of four

[LLVMdev] ARM backend problem ?

2007 Jun 12

[LLVMdev] ARM backend problem ?

Hello, I want to compile a LLVM file into an executable running on ARM platform. I use LLVM 2.0 with the following command lines: llvm-as -f -o test.bc test.ll llc -march=arm -mcpu=arm1136j-s -mattr=+v6 -f -o test.s test.bc arm-linux-gnu-as -mcpu=arm1136j-s test.s With the last command, I obtain the following error: rd and rm should be different in mul The bad instruction is

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

2015 Jun 11

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

[+Arnold] > On Jun 10, 2015, at 1:29 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > [+CC Andy] > >> Can anyone familiar with ScalarRevolution tell me whether this is an >> expected behavior or a bug? > > Assuming you're talking about 2*k, this is a bug. ScalarEvolution > should be able to prove that {0,+,4} is <nsw> and

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

2013 Oct 27

[LLVMdev] Missed optimization opportunity with piecewise load shift-or'd together?

The following piece of IR is a fixed point for opt -std-compile-opts/-O3: --- target datalayout = "e-p:64:64:64-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f16:16:16-f32:32:32-f64:64:64-f128:128:128-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: nounwind readonly define i32 @get32Bits(i8*

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

2015 Jun 10

[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr

I am testing vectorization on the following test case: float x[1024], y[1024]; void myloop1() { for (long int k = 0; k < 512; k++) { x[2*k] = x[2*k]+y[k]; } } Vectorization failed due to "unsafe dependent memory operation". I traced the LoopAccessAnalysis.cpp and found the reason is the NoWrapFlag for SCEVAddRecExpr is not set and consequently the

[LLVMdev] Functions: sret and readnone

2009 Oct 05

[LLVMdev] Functions: sret and readnone

On Oct 5, 2009, at 7:21 AM, Stephan Reiter wrote: > Hi all, > > I'm currently building a DSL for a computer graphics project that is > not unlike NVIDIA's Cg. I have an intrinsic with the following > signature > > float4 sample(texture tex, float2 coords); > > that is translated to this LLVM IR code: > > declare void @"sample"(%float4* noalias

[LLVMdev] change type allocoted register

2010 Jan 04

[LLVMdev] change type allocoted register

Hi; i am using llvm backend on x86 arch. My app ABI requires float2 (v2f32) to be passes as parameter and return in XMM0 register. Currently LLVM handles v2f32 using MMX register MM0. i wonder what changes do i need to do in LLVM to support that change; manipulating v2f32 (float2) using XMM and not MMX ? one place i identifies where a change needs to be done is X86CallingConv.td where it

similar to: [LLVMdev] Alias in LLVM 3.0