thr3ads.net - llvm dev - [LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer [May 2014]

If this information is useful, please help other people find it:
Share via:

Louis Gerbarg

2014-May-22 23:54 UTC

[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

On May 22, 2014, at 3:51 PM, Chandler Carruth <chandlerc at google.com>
wrote:
> 
> On Thu, May 22, 2014 at 4:42 PM, Louis Gerbarg <lgg at apple.com>
wrote:
> The problem that the above transform is technically illegal because “When
indexing into a (optionally packed) structure, only i32 integer constants are
allowed (when using a vector of indices they must all be the same i32 integer
constant).” rule
<http://llvm.org/docs/LangRef.html#getelementptr-instruction>.
> 
> Wait, I don't follow. You don't violate this rule. The first index
in GEP is *not* into the structure, it is along the implicit array of objects
that any pointer refers to. Thus the i64 operand to the first GEP selects a
different struct object in an array of them, and the first i64 operand to the
second GEP re-selects the same struct object but then uses an i32 index into it.
> 
> Perhaps you need a better example to show the illegal transform? That would
help me understand the rest of your problem.

Hmm… you are correct, it turns out you are correct, I only have to worry if it
is constant after the second arg of the GEP. That should allow my case to
continue to work. I am currently building a stage2 now to see if works cleanly.

Having said that, there still exist cases where you are indexing into a
homogenous struct like so. Consider the following two functions, which are the
same except one is written using structs and the other is written using arrays:

%struct1 = type { i32, i32 }                                                    
%struct2 = type { %struct1, %struct1 }                                          
; Function Attrs: ssp uwtable                                                   
define i32 @test1(%struct2* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {             
bb:                                                                             
  br i1 %tmp4, label %bb1, label %bb2                                           
bb1:                                              ; preds = %bb5                
  %tmp10 = getelementptr inbounds %struct2* %dm, i64 %tmp9, i32 0               
  br label %bb3                                                                 
                                                                                
bb2:                                             ; preds = %.lr.ph.i.i          
  %tmp20 = getelementptr inbounds %struct2* %dm, i64 %tmp9, i32 1              
  br label %bb3                                                                 
                                                                                
bb3:                                      ; preds = %bb2, %bb1                  
  %phi = phi %struct1* [ %tmp10, %bb1 ], [ %tmp20, %bb2 ]                       
  %tmp24 = getelementptr inbounds %struct1* %phi, i64 0, i32 1                  
  %tmp25 = load i32* %tmp24, align 4                                            
  ret i32 %tmp25                                                                
}                                                                               
                                                                                
                                                                                
%array1 = type [2 x i32]                                                        
%array2 = type [2 x %array1]                                                    
                                                                                
; Function Attrs: ssp uwtable                                                   
define i32 @test2(%array2* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {              
bb:                                                                             
  br i1 %tmp4, label %bb1, label %bb2                                           
bb1:                                              ; preds = %bb5                
  %tmp10 = getelementptr inbounds %array2* %dm, i64 %tmp9, i32 0                
  br label %bb3                                                                 
                                                                                
bb2:                                             ; preds = %.lr.ph.i.i          
  %tmp20 = getelementptr inbounds %array2* %dm, i64 %tmp19, i32 1               
  br label %bb3                                                                 
                                                                                
bb3:                                      ; preds = %bb2, %bb1                  
  %phi = phi %array1* [ %tmp10, %bb1 ], [ %tmp20, %bb2 ]                        
  %tmp24 = getelementptr inbounds %array1* %phi, i64 0, i32 1                   
  %tmp25 = load i32* %tmp24, align 4                                            
  ret i32 %tmp25                                                                
}

The @test1 function cannot have the optimization applied because the %tmp10 and
%tmp20 GEPs vary by a field index into the struct, whereas @test2 can be
optimized to:

define i32 @test2([2 x [2 x i32]]* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {
bb:
  br i1 %tmp4, label %bb1, label %bb2

bb1:                                              ; preds = %bb
  br label %bb3

bb2:                                              ; preds = %bb
  br label %bb3

bb3:                                              ; preds = %bb2, %bb1
  %0 = phi i32 [ 0, %bb1 ], [ 1, %bb2 ]
  %tmp24 = getelementptr inbounds [2 x [2 x i32]]* %dm, i64 %tmp9, i32 %0, i32 1
  %tmp25 = load i32* %tmp24, align 4
  ret i32 %tmp25
}

Louis


-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140522/435eb1e2/attachment.html>

Chandler Carruth

2014-May-23 00:09 UTC

head link

[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

On Thu, May 22, 2014 at 5:54 PM, Louis Gerbarg <lgg at apple.com> wrote:
> Hmm… you are correct, it turns out you are correct, I only have to worry
> if it is constant after the second arg of the GEP. That should allow my
> case to continue to work. I am currently building a stage2 now to see if
> works cleanly.
>
Cool. Hopefully we can make incremental progress on the problem while
figuring out the right thing to do in the fully general IR case...

>
> Having said that, there still exist cases where you are indexing into a
> homogenous struct like so. Consider the following two functions, which are
> the same except one is written using structs and the other is written using
> arrays:
>
Ah, delightful. I was pretty sure there was a more complex example that
still exhibited the problem, but wanted to see it to think through things
properly.

>
> %struct1 = type { i32, i32 }
>
> %struct2 = type { %struct1, %struct1 }
>
> ; Function Attrs: ssp uwtable
>
> define i32 @test1(%struct2* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {
>
> bb:
>
>   br i1 %tmp4, label %bb1, label %bb2
>
> bb1:                                              ; preds = %bb5
>
>   %tmp10 = getelementptr inbounds %struct2* %dm, i64 %tmp9, i32 0
>
>   br label %bb3
>
>
>
> bb2:                                             ; preds = %.lr.ph.i.i
>
>   %tmp20 = getelementptr inbounds %struct2* %dm, i64 %tmp9, i32 1
>
>   br label %bb3
>
>
>
> bb3:                                      ; preds = %bb2, %bb1
>
>   %phi = phi %struct1* [ %tmp10, %bb1 ], [ %tmp20, %bb2 ]
>
>   %tmp24 = getelementptr inbounds %struct1* %phi, i64 0, i32 1
>
>   %tmp25 = load i32* %tmp24, align 4
>
>   ret i32 %tmp25
>
> }
>
>
>
>
>
> %array1 = type [2 x i32]
>
> %array2 = type [2 x %array1]
>
>
>
> ; Function Attrs: ssp uwtable
>
> define i32 @test2(%array2* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {
>
> bb:
>
>   br i1 %tmp4, label %bb1, label %bb2
>
> bb1:                                              ; preds = %bb5
>
>   %tmp10 = getelementptr inbounds %array2* %dm, i64 %tmp9, i32 0
>
>   br label %bb3
>
>
>
> bb2:                                             ; preds = %.lr.ph.i.i
>
>   %tmp20 = getelementptr inbounds %array2* %dm, i64 %tmp19, i32 1
>
>   br label %bb3
>
>
>
> bb3:                                      ; preds = %bb2, %bb1
>
>   %phi = phi %array1* [ %tmp10, %bb1 ], [ %tmp20, %bb2 ]
>
>   %tmp24 = getelementptr inbounds %array1* %phi, i64 0, i32 1
>
>   %tmp25 = load i32* %tmp24, align 4
>
>   ret i32 %tmp25
>
> }
>
>
> The @test1 function cannot have the optimization applied because the
> %tmp10 and %tmp20 GEPs vary by a field index into the struct, whereas
> @test2 can be optimized to:
>
> define i32 @test2([2 x [2 x i32]]* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {
> bb:
>   br i1 %tmp4, label %bb1, label %bb2
>
> bb1:                                              ; preds = %bb
>   br label %bb3
>
> bb2:                                              ; preds = %bb
>   br label %bb3
>
> bb3:                                              ; preds = %bb2, %bb1
>   %0 = phi i32 [ 0, %bb1 ], [ 1, %bb2 ]
>   %tmp24 = getelementptr inbounds [2 x [2 x i32]]* %dm, i64 %tmp9, i32 %0,
> i32 1
>   %tmp25 = load i32* %tmp24, align 4
>   ret i32 %tmp25
> }
>
>So, here is why relaxing this rule is hard:

Without a constant index, we don't know what sub-struct to use for
interpreting the next index in the event of heterogeneity. Without this, we
can't do anything, and so we definitionally preclude the transform.

While clearly we can make this transform safely for homogenous structs,
doing so seriously weakens the IR's guarantees. I'd not like to see us
go
that direction.

Instead, if this transformation is indeed important (and it sounds like it
is), I have a somewhat crazier idea: fold homogeneous struct type layers
into arrays. Thoughts? (I've not thought about this much, there might be
dragons here...)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140522/ea901ea2/attachment.html>

Hal Finkel

2014-May-23 02:19 UTC

head link

[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

----- Original Message -----> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Louis Gerbarg" <lgg at apple.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, May 22, 2014 7:09:49 PM
> Subject: Re: [LLVMdev] RFC: Indexing of structs vs arrays in
getelementpointer
> 
> 
> 
> 
> 
> 
> On Thu, May 22, 2014 at 5:54 PM, Louis Gerbarg < lgg at apple.com >
> wrote:
> 
> 
> 
> Hmm… you are correct, it turns out you are correct, I only have to
> worry if it is constant after the second arg of the GEP. That should
> allow my case to continue to work. I am currently building a stage2
> now to see if works cleanly.
> 
> 
> Cool. Hopefully we can make incremental progress on the problem while
> figuring out the right thing to do in the fully general IR case...
> 
> 
> 
> 
> 
> Having said that, there still exist cases where you are indexing into
> a homogenous struct like so. Consider the following two functions,
> which are the same except one is written using structs and the other
> is written using arrays:
> 
> 
> Ah, delightful. I was pretty sure there was a more complex example
> that still exhibited the problem, but wanted to see it to think
> through things properly.
> 
> 
> 
> 
> 
> 
> 
> %struct1 = type { i32, i32 }
> %struct2 = type { %struct1, %struct1 }
> ; Function Attrs: ssp uwtable
> define i32 @test1(%struct2* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {
> bb:
> br i1 %tmp4, label %bb1, label %bb2
> bb1: ; preds = %bb5
> %tmp10 = getelementptr inbounds %struct2* %dm, i64 %tmp9, i32 0
> br label %bb3
> 
> bb2: ; preds = %.lr.ph.i.i
> %tmp20 = getelementptr inbounds %struct2* %dm, i64 %tmp9, i32 1
> br label %bb3
> 
> bb3: ; preds = %bb2, %bb1
> %phi = phi %struct1* [ %tmp10, %bb1 ], [ %tmp20, %bb2 ]
> %tmp24 = getelementptr inbounds %struct1* %phi, i64 0, i32 1
> 
> %tmp25 = load i32* %tmp24, align 4
> ret i32 %tmp25
> }
> 
> 
> %array1 = type [2 x i32]
> %array2 = type [2 x %array1]
> 
> ; Function Attrs: ssp uwtable
> define i32 @test2(%array2* %dm, i1 %tmp4, i64 %tmp9, i64 %tmp19) {
> bb:
> br i1 %tmp4, label %bb1, label %bb2
> bb1: ; preds = %bb5
> %tmp10 = getelementptr inbounds %array2* %dm, i64 %tmp9, i32 0
> br label %bb3
> 
> bb2: ; preds = %.lr.ph.i.i
> %tmp20 = getelementptr inbounds %array2* %dm, i64 %tmp19, i32 1
> br label %bb3
> 
> bb3: ; preds = %bb2, %bb1
> %phi = phi %array1* [ %tmp10, %bb1 ], [ %tmp20, %bb2 ]
> %tmp24 = getelementptr inbounds %array1* %phi, i64 0, i32 1
> 
> %tmp25 = load i32* %tmp24, align 4
> ret i32 %tmp25
> }
> 
> 
> The @test1 function cannot have the optimization applied because the
> %tmp10 and %tmp20 GEPs vary by a field index into the struct,
> whereas @test2 can be optimized to:
> 
> 
> 
> 
> define i32 @test2([2 x [2 x i32]]* %dm, i1 %tmp4, i64 %tmp9, i64
> %tmp19) {
> bb:
> br i1 %tmp4, label %bb1, label %bb2
> 
> 
> bb1: ; preds = %bb
> br label %bb3
> 
> 
> bb2: ; preds = %bb
> br label %bb3
> 
> 
> bb3: ; preds = %bb2, %bb1
> %0 = phi i32 [ 0, %bb1 ], [ 1, %bb2 ]
> %tmp24 = getelementptr inbounds [2 x [2 x i32]]* %dm, i64 %tmp9, i32
> %0, i32 1
> 
> %tmp25 = load i32* %tmp24, align 4
> ret i32 %tmp25
> }
> 
> So, here is why relaxing this rule is hard:
> 
> 
> Without a constant index, we don't know what sub-struct to use for
> interpreting the next index in the event of heterogeneity. Without
> this, we can't do anything, and so we definitionally preclude the
> transform.
> 
> 
> While clearly we can make this transform safely for homogenous
> structs, doing so seriously weakens the IR's guarantees. I'd not
> like to see us go that direction.
To what guarantees are you referring?

Thanks again,
Hal
> 
> 
> Instead, if this transformation is indeed important (and it sounds
> like it is), I have a somewhat crazier idea: fold homogeneous struct
> type layers into arrays. Thoughts? (I've not thought about this
> much, there might be dragons here...)
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

llvm dev - May 2014 - [LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer