Chris Lattner
2004-Apr-04 20:57 UTC
[LLVMdev] Two important changes to the getelementptr instruction
Hi all,

I just checked in a series of patches that makes some substantial changes to the LLVM getelementptr instruction. In particular, in LLVM 1.2 and earlier, the getelementptr instruction required index operands for structure types to be 'ubyte' constants and index operands for sequential types to be 'long' values. This had several problems, most notably that it was impossible to represent a structure with more than 256 fields, and performance for 32-bit targets suffered.

Now in CVS, structure indices are required to be 'uint' constants, which permits structures with up to 2^32 elements. Sequential type indices are allowed to be int, uint, long, or ulong values. This is documented in the LLVM language reference here:
http://llvm.cs.uiuc.edu/docs/LangRef.html#i_getelementptr

This change required modifying the encoding of the getelementptr instruction in bytecode files, so LLVM 1.3 bytecode files won't be readable by LLVM 1.2 and earlier. LLVM .ll files and .bc files from earlier releases, as usual, will work fine with the newer version. In particular, we will always support .ll files that use ubyte indexes for structure types, as the parser auto-upgrades them to uint indexes.

This change fixes the following bugs:
http://llvm.cs.uiuc.edu/PR82
http://llvm.cs.uiuc.edu/PR309

It also results in cleaner and simpler LLVM code, allowing many integer casts that were present for array indexes to go away. Though the encoding of getelementptr instructions is slightly less efficient than before (because we have to store what type the various sequential type operands are), the reduction in cast instructions actually results in a bytecode size reduction of about 1%.

If you have any questions about this change, please let me know,

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
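As an editorial illustration, the index rules described in the message above can be modelled in a few lines of Python. This is only a sketch of the rules as stated in the email, not LLVM code: the names OLD_RULES, NEW_RULES, index_allowed, max_struct_fields and upgrade_struct_index are hypothetical and do not correspond to anything in the LLVM sources.

```python
# Hypothetical model of the getelementptr index rules described above.
# None of these names exist in LLVM; they are for illustration only.

# Largest value each unsigned type can hold, which bounds the field number.
UNSIGNED_MAX = {"ubyte": 2**8 - 1, "uint": 2**32 - 1}

# LLVM 1.2 and earlier: 'ubyte' constants for structures, 'long' for sequential types.
OLD_RULES = {"struct": {"ubyte"}, "sequential": {"long"}}
# After the change: 'uint' constants for structures, four integer types for sequential types.
NEW_RULES = {"struct": {"uint"}, "sequential": {"int", "uint", "long", "ulong"}}


def index_allowed(rules, indexed_kind, index_type, is_constant):
    """Return True if an index of index_type is legal when indexing indexed_kind."""
    if indexed_kind == "struct" and not is_constant:
        return False  # structure indices must be constants under both rule sets
    return index_type in rules[indexed_kind]


def max_struct_fields(rules):
    """Number of distinct field numbers the structure index type can express."""
    (index_type,) = rules["struct"]
    return UNSIGNED_MAX[index_type] + 1


def upgrade_struct_index(index_type):
    """Mimic the described parser upgrade of old 'ubyte' structure indices to 'uint'."""
    return "uint" if index_type == "ubyte" else index_type
```

Under this model the old rules top out at 256 fields and reject an 'int' array index (forcing a cast to 'long'), while the new rules accept it directly.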
Reid Spencer
2004-Apr-04 23:09 UTC
[LLVMdev] Two important changes to the getelementptr instruction
Hi Chris,

Congrats on getting this taken care of finally. I know it's something you've wanted to do since 1.0.

I have one question. How does LLVM disambiguate between a uint used for a structure and a uint used for an array?

My assumption is that LLVM is aware of the type of the thing being indexed all the way through the dereference so it doesn't really matter what index type is being used.

If that's the case, then why restrict structure indexes to uint and not make it "any integer type"? This might further reduce casting and generalizes the programming model.

Mostly just curious ...

Reid.

_______________________
Reid Spencer
President & CTO
eXtensible Systems, Inc.
rspencer at x10sys.com
Chris Lattner
2004-Apr-05 00:03 UTC
[LLVMdev] Two important changes to the getelementptr instruction
On Sun, 4 Apr 2004, Reid Spencer wrote:
> Congrats on getting this taken care of finally. I know it's something
> you've wanted to do since 1.0.

It certainly has been on the grand TODO list for a long time :)

> I have one question. How does LLVM disambiguate between a uint used for
> a structure and a uint used for an array?

It depends on the operand number of the getelementptr instruction. For example:

  %P2 = getelementptr {int}* %P, uint 4

In the above example, we know that the uint is indexing "through" the pointer in the getelementptr instruction, so it is not an index into the structure. In this case:

  %P3 = getelementptr {int}* %P, uint 4, uint 0

The first index is just like the previous case, but the 'uint 0' is now indexing through the structure, so the number is a field number. The big distinction is that you can index with variables in arrays and pointers, but only constant integers in structures.

> My assumption is that LLVM is aware of the type of the thing being
> indexed all the way through the dereference so it doesn't really matter
> what index type is being used.

Very true.

> If that's the case, then why restrict structure indexes to uint and not
> make it "any integer type"? This might further reduce casting and
> generalizes the programming model.

The idea is that structure indices must be *constants*. As such, it doesn't really matter what type we make them, as long as there is enough "addressing space" to represent enough fields. Allowing structure indices to be any integer type is possible, but wouldn't give us anything: uint 7 means the same thing as sbyte 7.

The disadvantage of allowing any integer type for the structure index is that we have to encode the integer type used in the bytecode file, which takes space. Right now the type of the index is implicitly encoded into the bytecode file based on what operand of the getelementptr it is.

Another advantage is that clients of the LLVM IR can assume that all structure indexes are instances of ConstantUInt, which simplifies construction and modification of the IR a bit.

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
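As an editorial illustration of the disambiguation described in the reply above, the following Python sketch walks the indexed type one operand at a time. It is a simplified model under stated assumptions, not LLVM's implementation: types are represented as plain tuples, a variable index is represented by the value None, and the function name classify_indices is hypothetical.

```python
# Hypothetical model of how each getelementptr index is interpreted.
# Types are tuples: ("int",), ("ptr", T), ("array", T), ("struct", [T0, T1, ...]).
# An index is (type_name, value); value is None for a non-constant (variable) index.

SEQUENTIAL_INDEX_TYPES = {"int", "uint", "long", "ulong"}


def classify_indices(pointer_type, indices):
    """Return (roles, result_type) for a getelementptr on pointer_type.

    The role of each index follows from its operand position and from the
    type reached so far, not from the index's own integer type.
    """
    if pointer_type[0] != "ptr" or not indices:
        raise ValueError("need a pointer operand and at least one index")

    first_type, _ = indices[0]
    if first_type not in SEQUENTIAL_INDEX_TYPES:
        raise ValueError("bad type for the pointer index")
    roles = ["pointer"]          # the first index always steps through the pointer
    current = pointer_type[1]

    for index_type, value in indices[1:]:
        if current[0] == "struct":
            # Structure indices must be 'uint' constants: they are field numbers.
            if index_type != "uint" or value is None:
                raise ValueError("structure index must be a uint constant")
            roles.append("field")
            current = current[1][value]
        elif current[0] == "array":
            # Array indices may be variables of any allowed integer type.
            if index_type not in SEQUENTIAL_INDEX_TYPES:
                raise ValueError("bad type for an array index")
            roles.append("element")
            current = current[1]
        else:
            raise ValueError("cannot index into a non-aggregate type")

    return roles, ("ptr", current)
```

With P modelled as ("ptr", ("struct", [("int",)])), the index list [("uint", 4)] yields a single "pointer" role and a result of the same pointer type, matching the %P2 example; adding ("uint", 0) yields roles "pointer" then "field" and a pointer to int, matching %P3.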