thr3ads.net - search: "amdgpupromotealloca"

Displaying 5 results from an estimated 5 matches for "amdgpupromotealloca".

AMDGPUPromoteAlloca assume 3-dims enabled?

2016 May 31

Hi list, I found that AMDGPUPromoteAlloca calculates the new pointer as follows:

    std::tie(TCntY, TCntZ) = getLocalSizeYZ(Builder);
    Value *TIdX = getWorkitemID(Builder, 0);
    Value *TIdY = getWorkitemID(Builder, 1);
    Value *TIdZ = getWorkitemID(Builder, 2);
    Value *Tmp0 = Builder.CreateMul(TCntY, TCntZ, "", true, true);
    T...
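The snippet above is cut off, so the rest of the computation is not visible here. As a minimal sketch, assuming the pass combines the three work-item IDs into one row-major linear index (an assumption, not something the truncated snippet confirms), the arithmetic would look like the following. The function name `linearWorkitemId` and its parameters are hypothetical and are not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API: flattens a 3-D work-item ID into a
// single index, assuming row-major order with X as the slowest dimension.
// cntY and cntZ stand in for the local sizes returned by getLocalSizeYZ.
constexpr std::uint32_t linearWorkitemId(std::uint32_t idX, std::uint32_t idY,
                                         std::uint32_t idZ, std::uint32_t cntY,
                                         std::uint32_t cntZ) {
  // Mirrors the visible first step: Tmp0 = TCntY * TCntZ.
  const std::uint32_t planeSize = cntY * cntZ;
  // Assumed remainder: idX * planeSize + idY * cntZ + idZ.
  return idX * planeSize + idY * cntZ + idZ;
}
```

Under this assumption, a 1-D launch (local sizes Y and Z equal to 1, IDs Y and Z equal to 0) reduces the index to the X ID alone, so the formula stays correct when only one dimension is in use; the cost is the extra ID and size queries, which may be what the question is about.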

[AMDGPU] non-hsa intrinsic with hsa target

2016 Mar 05

...ret void } which llc cannot handle; it shows the message "the non-hsa intrinsic with hsa target". After looking into the log (r259297), my question is whether there is another intrinsic that supports this case when the target is amdgcn--amdhsa. The log of r259297 states that the AMDGPUPromoteAlloca pass (a backend pass) will generate this intrinsic, but even when I just use emit-llvm without going through llc, this intrinsic is still emitted. [1] https://github.com/tstellarAMD/hsa-runtime Regards, 李弘宇 (Li, Hong-Yu) Department of Computer Science & Information Engineering National Taiwan U...

Re-numbering address space with a pass

2015 Nov 01

Hi all, I would like my optimization pass to change the address space of an object that is passed to the llvm.lifetime.start intrinsic, because I want to be able to identify such objects later in a codegen pass. I can get a pointer from the intrinsic using the CallInst::getArgOperand() function. However, I don't know what to do with it (or whether it is the pointer that I want). How can I change its address space?

[AMDGPU] non-hsa intrinsic with hsa target

2016 Mar 05

...with the message "the non-hsa intrinsic
>> with hsa target" shown.
>>
>> After looking into the log (r259297), my question is whether there is another
>> intrinsic that supports this case when the target is amdgcn--amdhsa. The
>> log of r259297 states that the AMDGPUPromoteAlloca pass (a backend pass)
>> will generate this intrinsic, but even when I just use emit-llvm without going
>> through llc, this intrinsic is still emitted.
>>
>> [1] https://github.com/tstellarAMD/hsa-runtime
>>
>> Regards,
>>
>> 李弘宇 (Li, Hong-Yu) ...

Extending SLP Vectorizer to deal with aggregates?

2015 Oct 14

I'm looking for a sanity check on extending SLP Vectorizer to deal with aggregates. I'd like to vectorize Julia tuple operations. The Julia compiler lowers tuples to LLVM arrays, not LLVM vectors. I've tried making Julia lower tuples to LLVM vectors, but that hurt performance when SLP Vectorizer was not applicable, because of extraction/insertion overhead. I.e., the Julia lowering
