On Tue, 06 May 2008 16:06:35 -0400, Gordon Henriksen wrote:

> On 2008-05-06, at 13:42, Hendrik Boom wrote:
>
>> One more question. I hope you're not getting tired of me already. Does
>> generating LLVM code have to proceed in any particular order?
>>
>> Of course, if I am writing LLVM assembler by appending characters to
>> the end of a sequential file, I'd have to write everything in the order
>> prescribed by the assembler syntax.
>>
>> But if I'm using the C interface to build an LLVM parse tree, does that
>> have to be in any particular time-order? Can I, for example, define a
>> few functions, start scattering code into them, decide I'd like to
>> declare some more local variables in one of them, generate code for
>> another, return to the first one and stick in a new basic block at its
>> start, discover I should have declared some more global variables, and
>> so forth?
>>
>> That could be very convenient.
>
> Yes, you can absolutely do this.
>
> -- Gordon

I think I may have found an exception to this -- the API seems to require
me to have all the fields for a struct ready before I construct the struct.
I don't have the ability to make a struct type, use it to declare some
variables, and still contribute fields to it during the rest of the
compilation.

Is there a reason for this limitation other than no one thinking of it?
Does it need to have all the type information early in building the parse
tree? I can't really imagine that. I for one could do without this
limitation.

I won't even ask to be able to contribute more fields at link time, though
that would be useful, too. Such link-time-assembled structures would
resemble the DXD dummy control sections that PL/1 used on OS/360.

-- hendrik

On Jun 12, 2008, at 11:38, Hendrik Boom wrote:

> On Tue, 06 May 2008 16:06:35 -0400, Gordon Henriksen wrote:
>
>> On 2008-05-06, at 13:42, Hendrik Boom wrote:
>>
>>> One more question. I hope you're not getting tired of me already.
>>> Does generating LLVM code have to proceed in any particular order?
>>>
>>> Of course, if I am writing LLVM assembler by appending characters
>>> to the end of a sequential file, I'd have to write everything in
>>> the order prescribed by the assembler syntax.
>>>
>>> But if I'm using the C interface to build an LLVM parse tree, does
>>> that have to be in any particular time-order? Can I, for example,
>>> define a few functions, start scattering code into them, decide I'd
>>> like to declare some more local variables in one of them, generate
>>> code for another, return to the first one and stick in a new basic
>>> block at its start, discover I should have declared some more
>>> global variables, and so forth?
>>>
>>> That could be very convenient.
>>
>> Yes, you can absolutely do this.
>
> I think I may have found an exception to this -- the API seems to
> require me to have all the fields for a struct ready before I
> construct the struct. I don't have the ability to make a struct
> type, use it to declare some variables, and still contribute fields
> to it during the rest of the compilation.
>
> Is there a reason for this limitation other than no one thinking of
> it? Does it need to have all the type information early in building
> the parse tree? I can't really imagine that. I for one could do
> without this limitation.

You really can't do this since LLVM types are shape isomorphic.
Observe what happens to the types of @x and @y:

    gordon$ cat input.ll
    %xty = type {i32}
    %yty = type {i32}
    @x = external constant %xty
    @y = external constant %yty

    gordon$ llvm-as < input.ll | llvm-dis
    ; ModuleID = '<stdin>'
    %xty = type { i32 }
    %yty = type { i32 }
    @x = external constant %xty    ; <%xty*> [#uses=0]
    @y = external constant %xty    ; <%xty*> [#uses=0]

(This is not a side-effect of llvm-as or llvm-dis, but a fundamental
property of the LLVM 'Type' class.)

The only type that is not shape-isomorphic is 'opaque'. Each mention of
'opaque' in LLVM IR is a distinct type:

    gordon$ cat input2.ll
    %xty = type opaque
    %yty = type opaque
    @x = external constant %xty
    @y = external constant %yty

    gordon$ llvm-as < input2.ll | llvm-dis
    ; ModuleID = '<stdin>'
    %xty = type opaque
    %yty = type opaque
    @x = external constant %xty    ; <%xty*> [#uses=0]
    @y = external constant %yty    ; <%yty*> [#uses=0]

> I won't even ask to be able to contribute more fields at link time,
> though that would be useful, too. Such link-time-assembled
> structures would resemble the DXD dummy control sections
> that PL/1 used on OS/360.

This is absolutely possible:

    @Type.field.offs = external constant i32
    ...
    %Type.field.offs = load i32* @Type.field.offs
    %obj.start = bitcast %object* %obj to i8*
    %obj.field = getelementptr i8* %obj.start, i32 %Type.field.offs
    %field.ptr = bitcast i8* %obj.field to %field*
    %field.val = load %field* %field.ptr

This is completely analogous to opaque data types in C. You can use any
of the following techniques:

    typedef struct OpaqueFoo *FooRef;
    /* like %object = type opaque in LLVM */

    typedef void *FooRef;
    /* like %object = type i8 in LLVM */

    typedef struct { struct Vtable *VT; } Base;
    typedef Base *FooRef;
    /* like %object = type { %vtable* } in LLVM */

-- Gordon
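
For readers more at home in C than in IR, the link-time offset idea can be
sketched roughly as below. This is only an illustration under stated
assumptions, not code from the thread: `Object`, `object_new`, `load_field`
and `store_field` are hypothetical names, `Type_field_offs` merely mirrors
`@Type.field.offs`, and the 8-byte layout with the field at offset 4 is made
up. The "layout owner" part would normally live in a separate translation
unit; it is kept in the same file here only so the sketch is self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ---- client side: knows nothing about the layout ---- */

/* Opaque handle, like `%object = type opaque`. */
typedef struct Object Object;

/* Like `@Type.field.offs = external constant i32`: resolved at link time. */
extern const int32_t Type_field_offs;

/* Hypothetical allocator supplied by whoever owns the layout. */
Object *object_new(void);

/* Cast to a byte pointer, add the byte offset, then load the field. */
int32_t load_field(const Object *obj) {
    const unsigned char *start = (const unsigned char *)obj;
    int32_t val;
    memcpy(&val, start + Type_field_offs, sizeof val);
    return val;
}

/* Same address computation, but storing instead of loading. */
void store_field(Object *obj, int32_t val) {
    unsigned char *start = (unsigned char *)obj;
    memcpy(start + Type_field_offs, &val, sizeof val);
}

/* ---- layout owner: normally a separate translation unit ---- */

enum { OBJECT_SIZE = 8 };            /* assumed: 4-byte header + 4-byte field */
const int32_t Type_field_offs = 4;   /* assumed byte offset of the field */

/* Returns a zero-initialised object, or NULL if allocation fails. */
Object *object_new(void) {
    assert(Type_field_offs + (int)sizeof(int32_t) <= OBJECT_SIZE);
    return calloc(1, OBJECT_SIZE);
}
```

The client functions never see the struct's fields, so the layout owner can
change the offset (or add fields ahead of it) without recompiling them; only
the value of `Type_field_offs` has to be supplied at link time.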

>> I think I may have found an exception to this -- the API seems to
>> require me to have all the fields for a struct ready before I
>> construct the struct. I don't have the ability to make a struct
>> type, use it to declare some variables, and still contribute fields
>> to it during the rest of the compilation.
>>
>> Is there a reason for this limitation other than no one thinking of
>> it? Does it need to have all the type information early in building
>> the parse tree? I can't really imagine that. I for one could do
>> without this limitation.
>
> You really can't do this since LLVM types are shape isomorphic.
> Observe what happens to the types of @x and @y:
>
>   gordon$ cat input.ll
>   %xty = type {i32}
>   %yty = type {i32}
>   @x = external constant %xty
>   @y = external constant %yty
>
>   gordon$ llvm-as < input.ll | llvm-dis
>   ; ModuleID = '<stdin>'
>   %xty = type { i32 }
>   %yty = type { i32 }
>   @x = external constant %xty    ; <%xty*> [#uses=0]
>   @y = external constant %xty    ; <%xty*> [#uses=0]
>
> (This is not a side-effect of llvm-as or llvm-dis, but a fundamental
> property of the LLVM 'Type' class.)
>
> The only type that is not shape-isomorphic is 'opaque'. Each mention
> of 'opaque' in LLVM IR is a distinct type:
>
>   gordon$ cat input2.ll
>   %xty = type opaque
>   %yty = type opaque
>   @x = external constant %xty
>   @y = external constant %yty
>
>   gordon$ llvm-as < input2.ll | llvm-dis
>   ; ModuleID = '<stdin>'
>   %xty = type opaque
>   %yty = type opaque
>   @x = external constant %xty    ; <%xty*> [#uses=0]
>   @y = external constant %yty    ; <%yty*> [#uses=0]

So it appears that types are processed for identity the moment they are
made during parse tree construction? This means that a type has to be
completely known on creation. Presumably there's some mechanism for a type
that isn't completely known yet -- or is that avoided by having a type
'pointer' instead of 'pointer-to-foo'?

-- hendrik
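
One way to picture "processed for identity the moment they are made" is a
toy type table that hands back the same pointer for the same shape, and a
fresh pointer for every opaque type. The sketch below is only a mental
model with hypothetical names (`type_i32`, `type_struct`, `type_opaque`,
the fixed-size `pool`); it is not how LLVM's 'Type' class is implemented
and says nothing about how LLVM handles not-yet-complete types.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FIELDS 4     /* toy limits, chosen arbitrarily */
#define MAX_TYPES  64

typedef enum { KIND_I32, KIND_STRUCT, KIND_OPAQUE } Kind;

typedef struct Type {
    Kind kind;
    size_t nfields;
    const struct Type *fields[MAX_FIELDS];
} Type;

static Type pool[MAX_TYPES];
static size_t pool_len = 0;

/* Take the next free slot in the table and tag it with a kind. */
static Type *pool_alloc(Kind kind) {
    assert(pool_len < MAX_TYPES);
    Type *t = &pool[pool_len++];
    t->kind = kind;
    t->nfields = 0;
    return t;
}

/* There is exactly one i32 type. */
const Type *type_i32(void) {
    static const Type i32 = { KIND_I32, 0, { NULL } };
    return &i32;
}

/* Every call yields a distinct type: opaque types are never merged. */
const Type *type_opaque(void) {
    return pool_alloc(KIND_OPAQUE);
}

/* Field types are already uniqued, so comparing pointers compares shapes. */
static int same_fields(const Type *t, const Type *const *fields, size_t n) {
    if (t->nfields != n) return 0;
    for (size_t i = 0; i < n; i++)
        if (t->fields[i] != fields[i]) return 0;
    return 1;
}

/* Look the shape up first; create a new entry only if it is unseen. */
const Type *type_struct(const Type *const *fields, size_t n) {
    assert(n <= MAX_FIELDS);
    for (size_t i = 0; i < pool_len; i++)
        if (pool[i].kind == KIND_STRUCT && same_fields(&pool[i], fields, n))
            return &pool[i];
    Type *t = pool_alloc(KIND_STRUCT);
    t->nfields = n;
    for (size_t i = 0; i < n; i++)
        t->fields[i] = fields[i];
    return t;
}
```

In this model, two requests for `{ i32 }` return the same pointer, just as
@x and @y ended up sharing %xty above, while two calls to `type_opaque`
never do. Appending a field to an existing struct entry would change its
shape and so its identity for every user of that pointer, which is why the
field list has to be complete when `type_struct` is called.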