Hi LLVM members,

I have been working with my team on a target-independent memory layout for
struct types (aggregate types). As I understand it, a struct type in current
LLVM has a different memory layout on each target system.

To implement this concept:

First, I have been implementing a common type for struct types in the bitcode
at compilation time using llvm-gcc, and then changing the common type to
target-specific types at code generation time using llc (reconstructing
StructLayout).

Second, I have been adding two new intrinsic functions:

1. "getelement" intrinsic function to load from a bitfield of a struct type.
2. "setelement" intrinsic function to store to a bitfield of a struct type.

I would like to know what LLVM developers think about this concept
(advice, problems, other solutions, etc.).

Thanks,
Jin-Gu Kang
On 19 October 2010 07:57, Jin Gu Kang <jaykang10 at imrc.kist.re.kr> wrote:
> First, I have been implementing common type for struct type on bitcode
> at compilation time using llvm-gcc and then changing common type to target
> specific types at code generation time using llc (reconstruct StructLayout).

Hi Jin,

Apart from bitfields and unions, the struct type is pretty much target
agnostic. What kind of target-specific structure modifications do you have in
mind?

> Second, I have been adding two new intrinsic functions as following.
> 1. "getelement" intrinsic function to load from bitfield of struct type.
> 2. "setelement" intrinsic function to store to bitfield of struct type.

Bitfields and unions can make the IR a big mess. Analysing the memory layout
of a bitfield is not trivial, and the low-level bit fiddling you have to do is
too big and ugly to be in IR. An instruction that knows how to do that would
be beneficial (for IR's sake).

But there are other issues at hand, like the IR assuming certain casts are
valid (e.g. struct { double } -> char[8]) for a memcpy, when you could have it
differently for different targets. There was a discussion a few weeks ago
about unions; you should read it to understand the problems involved.

Bear in mind that whatever change you make in the generic part of the IR, you
have to implement (or find people that will) in *every* target, otherwise we
can never get rid of the old implementation and yours will not "catch on".
That's exactly what happened to the previous union type, and it will happen
again if it's not done all the way through.

cheers,
--renato
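To make the struct { double } -> char[8] case above concrete, here is a minimal C sketch of copying such a struct through a byte buffer with memcpy. The names `wrapper`, `wrapper_size` and `roundtrip_through_bytes` are hypothetical, chosen only for illustration. The buffer is sized with sizeof rather than a hard-coded 8, because the size is a property of the target and not something the C standard fixes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper type standing in for "struct { double }". */
struct wrapper { double d; };

/* Size of the wrapper on the current target. It is 8 on common targets,
   but that is an assumption about the target, not a language guarantee. */
size_t wrapper_size(void) {
    return sizeof(struct wrapper);
}

/* Copy the struct into a byte buffer and back again with memcpy. */
double roundtrip_through_bytes(double x) {
    struct wrapper in = { x };
    struct wrapper out;
    unsigned char bytes[sizeof(struct wrapper)];

    memcpy(bytes, &in, sizeof bytes);   /* struct -> raw bytes */
    memcpy(&out, bytes, sizeof out);    /* raw bytes -> struct */
    return out.d;
}
```

Calling `roundtrip_through_bytes(3.5)` gives back the same value on any target; what differs between targets is only what `wrapper_size()` reports.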
Hi Jin Gu Kang,

> I have been working to make target independent memory layout for struct
> type(Aggregate type) in my team.
> I think that struct type has different memory layouts according to each target
> system in current LLVM.

Yes, this is true, because alignment depends on the target. You may want to
look at packed structs, where no alignment padding is added between elements.
This does not completely fix the problem, because non-aggregate types can have
a size that depends on their alignment: for example, x86 long double occupies
12 bytes in a packed struct on x86-32 linux but 16 bytes on x86-32 darwin. It
used to be that in packed structs the amount of space was the minimum possible
(10 bytes for an x86 long double), but this was removed because it was
considered too confusing. However, it could probably be restored if really
needed.

> To implement target dependent concept for struct type,
> First, I have been implementing common type for struct type on bitcode at
> compilation time using llvm-gcc and then changing common type to target specific
> types at code generation time using llc (reconstruct StructLayout).
> Second, I have been adding two new intrinsic functions as following.
> 1. "getelement" intrinsic function to load from bitfield of struct type.
> 2. "setelement" intrinsic function to store to bitfield of struct type.
> I would like to know how LLVM developers think about above concept.
> (advices, problems, other solutions etc...)

I didn't understand this paragraph. Perhaps you could explain with a simple
example?

Best wishes,
Duncan.
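The effect of alignment and packing described above can be observed from C. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler (the `__attribute__((packed))` syntax is a compiler extension, not standard C). The struct and function names are hypothetical. The numbers reported differ between targets, which is the point being made.

```c
#include <stddef.h>

/* Naturally aligned: the compiler may insert padding before `value`. */
struct padded_pair { char tag; long double value; };

/* Packed (GCC/Clang extension): no padding is inserted between members. */
struct __attribute__((packed)) packed_pair { char tag; long double value; };

/* Offset of `value` in each variant. */
size_t padded_value_offset(void) { return offsetof(struct padded_pair, value); }
size_t packed_value_offset(void) { return offsetof(struct packed_pair, value); }

/* Total size of each variant, and of long double itself on this target. */
size_t padded_size(void)      { return sizeof(struct padded_pair); }
size_t packed_size(void)      { return sizeof(struct packed_pair); }
size_t long_double_size(void) { return sizeof(long double); }
```

In the packed variant `value` starts directly after the one-byte `tag`, but the struct size still depends on how many bytes the target reserves for long double, so packing alone does not give one layout for all targets.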
Hi Renato,

Firstly, I have been removing target-specific information from struct types
in the bitcode. Target-specific information includes type size, type
alignment, merged bitfields and so on.

For example:

struct test {
    char a:3;
    char b:4;
    char c:3;
    char d:2;
};

struct test vm = {1, 2, 3, 1};

int main(void)
{
    int a;
    vm.d = 1;
}

The above source code is compiled using the cross arm-llvm-gcc as follows:

 1 ; ModuleID = './temp2.c'
 2 target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64"
 3 target triple = "armv5-none-linux-gnueabi"
 4
 5 %0 = type { i8, i8 }
 6 %struct.test = type <{ i8, i8 }>
 7
 8 @vm = global %0 { i8 17, i8 11 }                ; <%0*> [#uses=1]
 9
10 define arm_aapcscc i32 @main() nounwind {
11 entry:
12   %retval = alloca i32                          ; <i32*> [#uses=1]
13   %a = alloca i32                               ; <i32*> [#uses=0]
14   %"alloca point" = bitcast i32 0 to i32        ; <i32> [#uses=0]
15   %0 = load i8* getelementptr inbounds (%struct.test* bitcast (%0* @vm to %struct.test*), i32 0, i32 1), align 1 ; <i8> [#uses=1]
16   %1 = and i8 %0, -25                           ; <i8> [#uses=1]
17   %2 = or i8 %1, 8                              ; <i8> [#uses=1]
18   store i8 %2, i8* getelementptr inbounds (%struct.test* bitcast (%0* @vm to %struct.test*), i32 0, i32 1), align 1
19   br label %return
20
21 return:                                         ; preds = %entry
22   %retval1 = load i32* %retval                  ; <i32> [#uses=1]
23   ret i32 %retval1
24 }

In lines 5-6, the bitfields of the struct type are merged. Merging bitfields
is affected by type alignment and type size (both are target-dependent
information).

I have been converting the above bitcode to more target-independent bitcode,
as follows:
; ModuleID = './temp2.c'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64"
target triple = "armv5-none-linux-gnueabi"

%0 = type { i3, i4, i3, i2 }
%struct.test = type <{ i3, i4, i3, i2 }>

@vm = global %0 { i3 1, i4 2, i3 3, i2 1 }      ; <%0*> [#uses=1]

define arm_aapcscc i32 @main() nounwind {
entry:
  %retval = alloca i32                          ; <i32*> [#uses=1]
  %a = alloca i32                               ; <i32*> [#uses=0]
  %"alloca point" = bitcast i32 0 to i32        ; <i32> [#uses=0]
  call void @llvm.setelement.i2(i8* bitcast (i4* getelementptr (%struct.test* bitcast (%0* @vm to %struct.test*), i32 1, i32 1) to i8*), i2 1)
  br label %return

return:                                         ; preds = %entry
  %retval1 = load i32* %retval                  ; <i32> [#uses=1]
  ret i32 %retval1
}

declare void @llvm.setelement.i2(i8*, i2) nounwind

I have been trying to preserve, in the bitcode, the shape the struct type has
in the original source code. This type becomes concrete for each target in
the llc pass, and the "setelement" intrinsic function above is also made
concrete in the llc pass. (Until now, I have been considering the compilation
path from llvm-gcc to llc. Currently, I am working on the llc pass.) As you
can see above, I have been trying to make the struct type more target
independent, though not completely.

Secondly, I also found target-dependent problems with union types through the
long double type in the coreutils package.

File isnan.c

# define DOUBLE long double
...
#define NWORDS \
  ((sizeof (DOUBLE) + sizeof (unsigned int) - 1) / sizeof (unsigned int))
typedef union { DOUBLE value; unsigned int word[NWORDS]; } memory_double;

Because bitwise operations are not allowed on floating-point data types, the
above isnan function converts the long double to an unsigned int array in
order to determine whether it is "not a number".

I think that both the above statements and the union type problems referred
to in your answer come from source code that does not consider portability
across multiple platforms.
I have been focusing on target-independent bitcode for source code that does
consider multi-platform portability, not the statements above. (A
target-dependent memory layout for struct types is generated even for source
code that considers multi-platform portability.) My team's goal is to make
the bitcode more target independent, though not completely.

In addition to the memory layout of struct types, I have been looking at
several other pieces of target-dependent information in the current bitcode.
If I have a chance, I would like to discuss them with LLVM developers. :)

I really appreciate your answer.

Thanks,
Jin-Gu Kang
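The read-modify-write sequence in lines 15-18 of the first IR listing in this message (load, and with -25, or with 8, store) can be reproduced in plain C. The sketch below assumes the layout implied by that ARM listing, where the second byte of the struct holds c in bits 0-2 and d in bits 3-4. The helper names `field_mask`, `get_field` and `set_field` are hypothetical and are not the proposed intrinsics; they only show the bit arithmetic that a target-specific lowering would have to produce.

```c
#include <stdint.h>

/* Assumed layout of the second byte of `struct test` in the ARM listing:
   c occupies bits 0-2 and d occupies bits 3-4. */
enum { C_SHIFT = 0, C_WIDTH = 3, D_SHIFT = 3, D_WIDTH = 2 };

/* Mask covering `width` bits starting at bit `shift` within one byte. */
uint8_t field_mask(unsigned shift, unsigned width) {
    return (uint8_t)(((1u << width) - 1u) << shift);
}

/* Read a bitfield out of its containing byte. */
uint8_t get_field(uint8_t byte, unsigned shift, unsigned width) {
    return (uint8_t)((byte & field_mask(shift, width)) >> shift);
}

/* Write a bitfield into its containing byte, leaving other bits intact. */
uint8_t set_field(uint8_t byte, unsigned shift, unsigned width, uint8_t value) {
    uint8_t mask = field_mask(shift, width);
    /* Clear the field: for d the inverted mask is 0xE7, i.e. -25 as an i8. */
    uint8_t cleared = (uint8_t)(byte & (uint8_t)~mask);
    /* Position the new value: for d = 1 this is 1 << 3 = 8. */
    uint8_t placed = (uint8_t)(((unsigned)value << shift) & mask);
    return (uint8_t)(cleared | placed);
}
```

With the initializer bytes from the listing, `get_field(17, 0, 3)` and `get_field(17, 3, 4)` recover a = 1 and b = 2, and `get_field(11, C_SHIFT, C_WIDTH)` and `get_field(11, D_SHIFT, D_WIDTH)` recover c = 3 and d = 1. The shifts and widths are exactly the target-dependent part that the merged { i8, i8 } type hides.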
Hi Renato,

First, I really appreciate your answer. :)

The IR in my previous e-mail is incomplete so far, and I am still trying
various forms of it. My team members decided to add new types to solve the
bitfield alignment problem.

Let's consider your previous examples:

struct testChar {
    char a:3;
    char b:4;
    char c:3;
    char d:2;
};

struct testShort {
    short a:3;
    short b:4;
    short c:3;
    short d:2;
};

struct testInt {
    int a:3;
    int b:4;
    int c:3;
    int d:2;
};

In future, our IR will be represented as follows:

%Char = type { c3, c4, c3, c2 }
%Short = type { s3, s4, s3, s2 }
%Int = type { i3, i4, i3, i2 }

This is just a concept and I haven't implemented it yet. I usually try to
modify the original LLVM system as little as possible, so I prefer adding new
types to modifying the original LLVM types.

I will report your opinion to my team members and we will discuss it.

Thanks for your answer,
Jin-Gu Kang
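Why the declared base type of a bitfield would need to be recorded can be checked from C: the three structs above hold the same 12 bits of fields, yet a compiler may give them different sizes because the base type determines the storage unit. The sketch below is only an illustration. It assumes a compiler that accepts char and short bitfields (an implementation-defined extension that GCC and Clang support), and the function names are hypothetical.

```c
#include <stddef.h>

struct testChar  { char a:3;  char b:4;  char c:3;  char d:2;  };
struct testShort { short a:3; short b:4; short c:3; short d:2; };
struct testInt   { int a:3;   int b:4;   int c:3;   int d:2;   };

/* Sizes chosen by the compiler for the current target. */
size_t size_of_test_char(void)  { return sizeof(struct testChar); }
size_t size_of_test_short(void) { return sizeof(struct testShort); }
size_t size_of_test_int(void)   { return sizeof(struct testInt); }

/* The field values are independent of the layout: 1 + 2 + 3 + 1. */
int sum_of_int_fields(void) {
    struct testInt v = { 1, 2, 3, 1 };
    return v.a + v.b + v.c + v.d;
}
```

Printing the three sizes on different targets shows which of them agree and which do not; the field values themselves are unaffected, which is the separation between logical shape and physical layout that the c3/s3/i3 notation is trying to capture.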