Displaying 13 results from an estimated 13 matches for "a_ptr".
2017 Aug 04
3
[RFC][InlineCost] Modeling JumpThreading (or similar) in inline cost model
...he inline cost model, but I'm unsure
how to proceed. Let me begin by first describing the problem I'm trying
to solve. Consider the following pseudo C code:
typedef struct element {
  unsigned idx;
} element_t;

static inline
unsigned char fn2 (element_t *dst_ptr, const element_t *a_ptr,
                   const element_t *b_ptr, unsigned char changed) {
  if (a_ptr && b_ptr && a_ptr->idx == b_ptr->idx) {
    if (!changed && dst_ptr && dst_ptr->idx == a_ptr->idx) {
      /* Do something */
    } else {
      changed = 1;
      if...
2017 Aug 07
3
[RFC][InlineCost] Modeling JumpThreading (or similar) in inline cost model
...2 and only one of them has the property that the 1st and 2nd argument are the same (as is shown in my pseudo code). Internally, we have another developer, Matt Simpson, working on a function specialization patch that might be of value here. Specifically, you could clone fn2 based on the fact that a_ptr == dst_ptr and then simplify a great deal of the function. However, that patch is still a WIP.
GCC will do IPA-CP/const-prop with cloning, and I'm wildly curious if new GCCs catch this case for you at higher optimization levels?
GCC does inline fn2 into fn1 in this particular case, but...
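The cloning idea mentioned in this snippet (specializing fn2 for call sites where a_ptr == dst_ptr) can be sketched roughly as follows. This is only an illustration: the body of fn2 is truncated in the archive, so the bodies below are stand-ins, and the names fn2_general, fn2_dst_is_a and check_equivalent are hypothetical and do not come from the thread or from any patch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct element {
    unsigned idx;
} element_t;

/* Stand-in for the general function. The real body is elided in the
 * archive, so the "do something" branch is left empty here. */
static unsigned char fn2_general(element_t *dst_ptr, const element_t *a_ptr,
                                 const element_t *b_ptr, unsigned char changed) {
    if (a_ptr && b_ptr && a_ptr->idx == b_ptr->idx) {
        if (!changed && dst_ptr && dst_ptr->idx == a_ptr->idx) {
            /* Do something (omitted in this sketch) */
        } else {
            changed = 1;
        }
    }
    return changed;
}

/* Hypothetical clone for callers that pass the same pointer as dst_ptr
 * and a_ptr. Inside the outer branch dst_ptr is known to be non-null and
 * dst_ptr->idx == a_ptr->idx holds trivially, so the inner test reduces
 * to !changed. */
static unsigned char fn2_dst_is_a(element_t *dst_ptr, const element_t *b_ptr,
                                  unsigned char changed) {
    if (dst_ptr && b_ptr && dst_ptr->idx == b_ptr->idx) {
        if (changed) {
            changed = 1; /* the former else branch */
        }
    }
    return changed;
}

/* Checks that both versions agree for one call where dst_ptr == a_ptr. */
static void check_equivalent(element_t *x, const element_t *b,
                             unsigned char changed) {
    assert(fn2_general(x, x, b, changed) == fn2_dst_is_a(x, b, changed));
}
```

The clone takes one pointer argument fewer and drops two of the three inner conditions, which is the kind of simplification the snippet refers to. Whether a compiler performs this automatically depends on its cloning heuristics.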
2017 Aug 04
4
[RFC][InlineCost] Modeling JumpThreading (or similar) in inline cost model
...and only one of them has the property that the 1st and 2nd
argument are the same (as is shown in my pseudo code). Internally, we
have another developer, Matt Simpson, working on a function
specialization patch that might be of value here. Specifically, you
could clone fn2 based on the fact that a_ptr == dst_ptr and then
simplify a great deal of the function. However, that patch is still a WIP.
> GCC will do IPA-CP/const-prop with cloning, and I'm wildly curious if
> new GCCs catch this case for you at higher optimization levels?
GCC does inline fn2 into fn1 in this particula...
2011 Aug 11
2
[LLVMdev] Pass a struct on windows
...ddr, i8*, i64) nounwind {
entry:
%agg_addr.0 = getelementptr inbounds %0* %agg_addr, i32 0, i32 0
store i8* %0, i8** %agg_addr.0, align 8
%agg_addr.1 = getelementptr inbounds %0* %agg_addr, i32 0, i32 1
store i64 %1, i64* %agg_addr.1, align 8
ret void
}
define i64 @call_by_pointer() {
%a_ptr = alloca float, align 4
%a_opq = bitcast float* %a_ptr to i8*
%agg_addr = alloca %0, align 8
call void @foo_by_pointer(%0* %agg_addr, i8* %a_opq, i64 4)
%agg_addr.1 = getelementptr inbounds %0* %agg_addr, i32 0, i32 1
%result.1 = load i64* %agg_addr.1, align 8
ret i64 %result.1
}
(2) P...
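For readers less used to IR, the by-pointer pattern visible in this snippet corresponds roughly to the C below. It is a sketch under assumptions: the aggregate type %0 is unnamed in the IR, so agg_t and its field names are hypothetical, and the callee body is inferred from the two stores shown in the truncated definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name and field names for the anonymous { i8*, i64 } type. */
typedef struct {
    void   *ptr;
    int64_t size;
} agg_t;

/* The callee fills the caller-provided aggregate through a pointer,
 * mirroring the two getelementptr/store pairs in the IR. */
static void foo_by_pointer(agg_t *agg_addr, void *p, int64_t n) {
    assert(agg_addr != NULL);
    agg_addr->ptr = p;
    agg_addr->size = n;
}

/* The caller allocates the aggregate on its own stack, passes its
 * address, then reads field 1 back, as in @call_by_pointer. */
static int64_t call_by_pointer(void) {
    float a = 0.0f;
    agg_t agg;
    foo_by_pointer(&agg, &a, 4);
    return agg.size;
}
```

Because the caller owns the storage, no target-specific rule for passing or returning a struct in registers is involved in this variant.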
2017 Aug 07
2
[RFC][InlineCost] Modeling JumpThreading (or similar) in inline cost model
...and 2nd argument are the same (as is shown in my
> pseudo code). Internally, we have another developer, Matt
> Simpson, working on a function specialization patch that might
> be of value here. Specifically, you could clone fn2 based on
> the fact that a_ptr == dst_ptr and then simplify a great deal
> of the function. However, that patch is still a WIP.
>
>
> GCC will do IPA-CP/const-prop with cloning, and I'm wildly
> curious if new GCCs catch this case for you at higher
> optimiz...
2011 Aug 11
0
[LLVMdev] Pass a struct on windows
....0 = getelementptr inbounds %0* %agg_addr, i32 0, i32 0
> store i8* %0, i8** %agg_addr.0, align 8
> %agg_addr.1 = getelementptr inbounds %0* %agg_addr, i32 0, i32 1
> store i64 %1, i64* %agg_addr.1, align 8
> ret void
> }
>
> define i64 @call_by_pointer() {
> %a_ptr = alloca float, align 4
> %a_opq = bitcast float* %a_ptr to i8*
> %agg_addr = alloca %0, align 8
> call void @foo_by_pointer(%0* %agg_addr, i8* %a_opq, i64 4)
> %agg_addr.1 = getelementptr inbounds %0* %agg_addr, i32 0, i32 1
> %result.1 = load i64* %agg_addr.1, align...
2011 Aug 11
1
[LLVMdev] Pass a struct on windows
...r, i32 0, i32 0
> > store i8* %0, i8** %agg_addr.0, align 8
> > %agg_addr.1 = getelementptr inbounds %0* %agg_addr, i32 0, i32 1
> > store i64 %1, i64* %agg_addr.1, align 8
> > ret void
> > }
> >
> > define i64 @call_by_pointer() {
> > %a_ptr = alloca float, align 4
> > %a_opq = bitcast float* %a_ptr to i8*
> > %agg_addr = alloca %0, align 8
> > call void @foo_by_pointer(%0* %agg_addr, i8* %a_opq, i64 4)
> > %agg_addr.1 = getelementptr inbounds %0* %agg_addr, i32 0, i32 1
> > %result.1 = load...
2011 Aug 15
3
[LLVMdev] structured types as function arguments
Hi,
When calling a function, does the LLVM code generator support passing structured types (arrays, structs, etc.) by _value_? I wrote some small examples, and it seemed to work, but I was wondering if anything can go wrong if the structured types are very large...
Thanks,
N
2012 Jul 21
3
Use GPU in R with .Call
...*************************************/
/* "VecAdd_cuda.c" adds two double vectors using GPU. */
/************************************************/
extern void vecAdd_kernel(double *ain,double *bin,double *cout,int len);
SEXP VecAdd_cuda(SEXP a,SEXP b) {
int len;
double *a_ptr,*b_ptr,*resout_ptr;
/*Digest R objects*/
len=length(a);
a_ptr=REAL(a);
b_ptr=REAL(b);
SEXP resout;
PROTECT(resout=allocVector(REALSXP,len));
resout_ptr=REAL(resout);
vecAdd_kernel(a_ptr,b_ptr,resout_ptr,len);
UNPROTECT(1);
return resout;
}
(b) Next, the host function and the k...
2012 Jun 30
2
[LLVMdev] llc -O# / opt -O# differences
...different. Here's a silly unoptimized bit of code which I'm generating
from my LLVM-backed program
; ModuleID = 'foo'
%Coord = type { double, double, double }
define double @foo(%Coord*, %Coord*) nounwind uwtable ssp {
entry:
%dx_ptr = alloca double
%b_ptr = alloca %Coord*
%a_ptr = alloca %Coord*
store %Coord* %0, %Coord** %a_ptr
store %Coord* %1, %Coord** %b_ptr
%a = load %Coord** %a_ptr
%addr = getelementptr %Coord* %a, i64 0
%2 = getelementptr inbounds %Coord* %addr, i32 0, i32 0
%3 = load double* %2
%b = load %Coord** %b_ptr
%addr1 = getelementptr %Coord...
2013 Oct 23
0
[LLVMdev] First attempt at recognizing pointer reduction
...rizontal reduction I mention above is not needed if you have no use inside the loop like in the case of:
r=0
for (i = …) {
  r += a[i];
}
return r;
This is simply (I am leaving out the induction variable for “i”):
%r_red = phi <2 x i32> [preheader, <%r, 0>] , [loop, %r_red_next]
%a_ptr = gep %a, %i
%val_of_a_sub_i = load <2 x i32> * %a_ptr
%r_red_next = add %r_red, %val_of_a_sub_i
Outside of the loop we reduce the vectorized reduction to the final value of “r”
%r= horizontal_add %r_red_next
ret %r
> In my example, I'm aggregating a computation in an array...
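The two-lane reduction described in this snippet can be mimicked in plain C to show where the horizontal add happens. This is a scalar sketch, not what the loop vectorizer emits: the function names are hypothetical, and the tail loop for odd lengths is an addition the snippet does not discuss.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: r = 0; for (...) r += a[i]; return r; */
static int sum_scalar(const int *a, size_t n) {
    int r = 0;
    for (size_t i = 0; i < n; ++i) {
        r += a[i];
    }
    return r;
}

/* Two-lane version: lanes[] plays the role of the <2 x i32> phi %r_red.
 * Each iteration adds a pair of elements lane by lane, and the lanes are
 * only combined after the loop (the horizontal add). */
static int sum_two_lanes(const int *a, size_t n) {
    int lanes[2] = {0, 0};
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        lanes[0] += a[i];
        lanes[1] += a[i + 1];
    }
    int r = lanes[0] + lanes[1]; /* horizontal add outside the loop */
    for (; i < n; ++i) {
        r += a[i]; /* leftover element when n is odd */
    }
    return r;
}
```

Since nothing inside the loop reads the scalar value of r, the lanes can stay separate until the loop exits, which is the point the snippet makes about not needing a horizontal reduction per iteration.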
2013 Oct 23
2
[LLVMdev] First attempt at recognizing pointer reduction
On 23 October 2013 16:05, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> In the examples you gave there are no reduction variables in the loop
> vectorizer’s sense. But, they all have memory accesses that are strided.
>
This is what I don't get. As far as I understood, a reduction variable is
the one that aggregates the computation done by the loop, and is used
2014 Oct 03
2
[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)
...AG: SUB_INT
> -; EG-DAG: ADD_INT
> +; EG-DAG: SUB_INT {{[* ]*}}[[HI]]
> +; EG-NOT: SUB
> define void @v_sub_i64(i64 addrspace(1)* noalias %out, i64 addrspace(1)* noalias %inA, i64 addrspace(1)* noalias %inB) nounwind {
> %tid = call i32 @llvm.r600.read.tidig.x() readnone
> %a_ptr = getelementptr i64 addrspace(1)* %inA, i32 %tid
> diff --git a/test/CodeGen/R600/uaddo.ll b/test/CodeGen/R600/uaddo.ll
> index 0b854b5..ce30bbc 100644
> --- a/test/CodeGen/R600/uaddo.ll
> +++ b/test/CodeGen/R600/uaddo.ll
> @@ -1,5 +1,5 @@
> ; RUN: llc -march=r600 -mcpu=SI -verif...