On Fri, 12 Sep 2008 11:06:30 -0700, Eli Friedman wrote:

> On Fri, Sep 12, 2008 at 9:35 AM, Hendrik Boom <hendrik at topoi.pooq.com>
> wrote:
>> I'd like to be able to make use of a structure type and its fields
>> before it is completely defined. To be specific, let me ask detailed
>> questions at various stages in the construction of a recursive type. I
>> copy from
>>
>> http://llvm.org/docs/ProgrammersManual.html#TypeResolve
>>
>> // Create the initial outer struct
>> PATypeHolder StructTy = OpaqueType::get();
>>
>> Is it possible to declare variables of type StructTy at this point?
>
> I think you can, although you have to be careful; if you don't make sure
> the variable eventually has a computable size, the module won't be
> valid.

Of course, eventually the type will be fully defined.

> Declaring variables of type pointer to StructTy is completely safe.
>
>> std::vector<const Type*> Elts;
>> Elts.push_back(PointerType::get(StructTy));
>>
>> Is it possible to build an expression that uses the newly generated Elt
>> as field-selector at this point? I'm hoping yes, but I suspect No,
>> because the elements of Elts* are clearly Type* instead of being a
>> field. In particular, if I use the same type twice to make two fields,
>> the corresponding elements of Elts will be indistinguishable.
>
> I'm not following; are you trying to access the first member of NewSTy
> here? You can't use a type that hasn't been created yet. You might be
> able to pull some tricks with incomplete types or casts, though.

What I want is to be able to use the fields that have already been defined, even though the type isn't complete yet. The vector<const Type*> is all I have at that moment, and it isn't a type. But by the time I have a type it's frozen and I can't add new fields to it.

Do I gather that I keep making new types, each slightly larger than the previous ones, cast each pointer to my growing type to the type-of-the-moment, and field-select from it; then finally complete the type when all is known? That might just work, if field-allocation is independent of later fields, but it is ugly.

The trouble is that llvm won't believe in fields until the structure is complete, and then it believes in all of them. While that's fine for semantics of the completed module, it makes less sense while the module is under construction.

I view type-declaration syntax as being syntax, and I'd like it to be as flexible and modifiable as syntax anywhere else in the parse tree. The time to interpret type declarations as actually defining specific types with known semantics is after the syntax has been constructed, not before. If it's possible to do some of it statically during parse tree construction, that's fine, but it shouldn't be *required*. But it's evidently not the way llvm thinks.

> http://llvm.org/docs/LangRef.html#i_getelementptr and
> http://llvm.org/docs/GetElementPtr.html might be useful here.

These operations also require a completed type.

>> Elts.push_back(Type::Int32Ty);
>> StructType *NewSTy = StructType::get(Elts);
>>
>> Presumably at this point it is definitely possible to declare variables
>> of type NewSTy and use field-selectors from NewSTy. But it's a little
>> too late for my purposes.
>
> Basically, the rule for opaque types is that in a valid module, you can
> do anything you could do with a declaration like "struct S;" in C.

Which is, basically, nothing but point to it and statically know that it's the same as or different from (possibly) other types.

> And in a module under construction, I'm pretty sure you can pull some
> more tricks, like declaring variables with types of unknown size, or
> accessing structs with members of unknown size.

But not their fields, because llvm doesn't believe in fields of a structure until it has them all. If the elements of Elts were fields of a yet-to-be-identified structure, I'd be able to use them; it would make sense then to use fields explicitly in the getelementptr instruction instead of the integers which have to be indexed into a type to obtain them.

> -Eli
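
The "struct S;" comparison can be made concrete with a minimal C++ sketch. This is not LLVM API code, only the C-side analogy from the message above, and the names `S`, `g_head`, `same_node`, `first_value`, and `demo` are hypothetical. It shows what an incomplete type permits and that the fields become usable only once the type is completed.

```cpp
#include <cassert>
#include <cstdint>

// Forward declaration: S is incomplete here, roughly like an opaque type.
struct S;

// Allowed with an incomplete type: pointers to it, and comparing them.
S* g_head = nullptr;
bool same_node(const S* a, const S* b) { return a == b; }

// Not allowed while S is incomplete (each line would fail to compile):
//   sizeof(S);        // size unknown
//   S value;          // cannot allocate storage
//   g_head->next;     // no fields exist yet

// Completing the type makes all of its fields available at once.
struct S {
    S* next;
    std::int32_t value;
};

std::int32_t first_value(const S* s) { return s->value; }

// Hypothetical driver: builds a two-node list and sums the values.
std::int32_t demo() {
    S b{nullptr, 7};
    S a{&b, 3};
    g_head = &a;
    assert(same_node(g_head->next, &b));
    return first_value(g_head) + first_value(g_head->next);
}
```

There is no intermediate state in which `next` is usable but `value` is not yet declared, which is the limitation being discussed.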
On Sat, Sep 13, 2008 at 11:09 AM, Hendrik Boom <hendrik at topoi.pooq.com> wrote:

> What I want is to be able to use the fields that have already been
> defined, even though the type isn't complete yet. The vector<const
> Type*> is all I have at that moment, and it isn't a type. But by the
> time I have a type it's frozen and I can't add new fields to it.
>
> Do I gather that I keep making new types, each slightly larger than the
> previous ones, cast each pointer to my growing type to the
> type-of-the-moment, and field-select from it; then finally complete the
> type when all is known? That might just work, if field-allocation is
> independent of later fields, but it is ugly.

Field-allocation is guaranteed to be independent of later fields, so the casting solution would work.

It might be slightly cleaner to define the types recursively... for example, define a struct as { i32 { float { i32* } } }. That way, you wouldn't have a bunch of partial types floating around.

-Eli
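
The prefix property behind the casting approach can be checked on the C++ side with `offsetof`. The sketch below assumes that the host compiler lays out plain structs by the same "place each field at the next suitably aligned offset" rule that LLVM uses for non-packed structs. `Stage1`, `Stage2`, and `Final` are hypothetical stand-ins for the successive partial types, not anything from the LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Successive "types of the moment": each adds one field to the previous one.
struct Stage1 { std::int32_t a; };
struct Stage2 { std::int32_t a; float b; };
struct Final  { std::int32_t a; float b; std::int32_t* c; };

// Offsets of earlier fields do not change when later fields are appended.
static_assert(offsetof(Stage1, a) == offsetof(Final, a), "a moved");
static_assert(offsetof(Stage2, a) == offsetof(Final, a), "a moved");
static_assert(offsetof(Stage2, b) == offsetof(Final, b), "b moved");

// Read field b from storage that really holds a Final, using only the
// layout of Stage2. memcpy is used so the C++ aliasing rules are respected.
float read_b_via_stage2(const void* storage) {
    float out;
    std::memcpy(&out,
                static_cast<const unsigned char*>(storage) + offsetof(Stage2, b),
                sizeof out);
    return out;
}

// Hypothetical driver: stores a Final, reads it back through the partial layout.
float demo_prefix() {
    Final f{1, 2.5f, nullptr};
    float b = read_b_via_stage2(&f);
    assert(b == f.b);
    return b;
}
```

The total size and alignment of the struct can still change as fields are added, so only the offsets of existing fields are stable, not allocation sizes computed from a partial type.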
On Sat, 13 Sep 2008 11:45:50 -0700, Eli Friedman wrote:

> On Sat, Sep 13, 2008 at 11:09 AM, Hendrik Boom <hendrik at topoi.pooq.com>
> wrote:
>> What I want is to be able to use the fields that have already been
>> defined, even though the type isn't complete yet. The vector<const
>> Type*> is all I have at that moment, and it isn't a type. But by the
>> time I have a type it's frozen and I can't add new fields to it.
>>
>> Do I gather that I keep making new types, each slightly larger than the
>> previous ones, cast each pointer to my growing type to the
>> type-of-the-moment, and field-select from it; then finally complete the
>> type when all is known? That might just work, if field-allocation is
>> independent of later fields, but it is ugly.
>
> Field-allocation is guaranteed to be independent of later fields, so the
> casting solution would work.

Thanks for the idea. I was starting to despair about making the compiler as flexible as I wanted it without abandoning llvm.

> It might be slightly cleaner to define the types recursively... for
> example, define a struct as { i32 { float { i32* } } }. That way, you
> wouldn't have a bunch of partial types floating around.

Just curious -- would struct{struct{i32, i8} i8} take just 6 bytes on the usual architectures?

-- hendrik
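
On the 6-byte question, the likely answer under ordinary (non-packed) layout rules is no: the inner struct is padded out to its own alignment before the outer field is placed. The C++ analogue below is a sketch that assumes a mainstream ABI where `int32_t` is 4-byte aligned. LLVM's own figures come from the target data layout, so treat the C++ sizes as an analogy, not as LLVM output. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Nested form, analogous to struct{struct{i32, i8} i8}.
struct Inner { std::int32_t a; std::int8_t b; };
struct Outer { Inner in; std::int8_t c; };

// Flat form with the same three fields, for comparison.
struct Flat { std::int32_t a; std::int8_t b; std::int8_t c; };

// Inner carries tail padding up to its alignment, so the outer field c
// starts after that padding instead of directly behind b.
std::size_t inner_size()     { return sizeof(Inner); }
std::size_t outer_size()     { return sizeof(Outer); }
std::size_t outer_c_offset() { return offsetof(Outer, c); }
std::size_t flat_size()      { return sizeof(Flat); }
std::size_t flat_c_offset()  { return offsetof(Flat, c); }

// Hypothetical check: nesting costs space relative to the flat layout
// whenever the inner struct has tail padding.
bool nesting_costs_space() {
    assert(outer_c_offset() >= inner_size());
    return outer_size() > flat_size();
}
```

On the common 32-bit and 64-bit ABIs this gives 8 bytes for `Inner`, 12 for `Outer`, and 8 for `Flat`, so the nested definition is not layout-compatible with the flat one when fields of different alignment are mixed.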
On Sat, 13 Sep 2008 11:45:50 -0700, Eli Friedman wrote:

> On Sat, Sep 13, 2008 at 11:09 AM, Hendrik Boom <hendrik at topoi.pooq.com>
> wrote:
>> What I want is to be able to use the fields that have already been
>> defined, even though the type isn't complete yet. The vector<const
>> Type*> is all I have at that moment, and it isn't a type. But by the
>> time I have a type it's frozen and I can't add new fields to it.
>>
>> Do I gather that I keep making new types, each slightly larger than the
>> previous ones, cast each pointer to my growing type to the
>> type-of-the-moment, and field-select from it; then finally complete the
>> type when all is known? That might just work, if field-allocation is
>> independent of later fields, but it is ugly.
>
> Field-allocation is guaranteed to be independent of later fields, so the
> casting solution would work.
>
> It might be slightly cleaner to define the types recursively... for
> example, define a struct as { i32 { float { i32* } } }. That way, you
> wouldn't have a bunch of partial types floating around.
>
> -Eli

Looking at this again, the conceptual problem is this.

It's natural in writing a code generator to want to generate code out of order. Given a suitable representation of strings, or temporary files, it's rather easy to do this if the generated code is text. You just make sure you can make insertions where you want, or stratify the code into different temporary files that are later concatenated, or something like that. With today's gigabyte RAM chips, this isn't a big deal. So all would be well generating llvm assembler.

But then I see the API to llvm that allows one to build the llvm parse tree directly, without making a huge string that has to be written out and parsed. It seems designed for the typical case -- that code will be generated out of order. You can remember insertion points into the parse tree, and inject things as needed. Except that this does not work with types.

llvm assembler has type declarations, which are as jugglable and expandable as any other piece of text -- until the generated code is complete and everything is written out for reading and parsing. The parse tree, however, doesn't seem to have a syntax for type declarations -- it only has types. It is not a parse tree for the llvm assembler. It is something else, something slightly different, but different enough to cause trouble. And that's the whole difference.

-- hendrik
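
The "stratify and concatenate" approach for textual output can be sketched as follows. This is a minimal illustration using only the C++ standard library. The class `ModuleText` and its methods are hypothetical, and the emitted lines use the older typed-pointer spelling of getelementptr, so treat the exact assembler text as illustrative and not tied to any particular LLVM release.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical emitter: the struct declaration and the code that uses it
// are kept in separate buffers and only joined when the module is finished.
class ModuleText {
public:
    // Append a field to the growing struct; the returned index is usable
    // immediately, long before the declaration itself is written out.
    unsigned add_field(const std::string& llvm_type) {
        fields_.push_back(llvm_type);
        return static_cast<unsigned>(fields_.size() - 1);
    }

    // Emit code that addresses a field by index into the code buffer.
    void emit_field_address(const std::string& result, unsigned index) {
        code_ += "%" + result + " = getelementptr %S* %s, i32 0, i32 " +
                 std::to_string(index) + "\n";
    }

    // Write the type declaration first, then the code generated earlier.
    std::string finish() const {
        std::string decl = "%S = type { ";
        for (std::size_t i = 0; i < fields_.size(); ++i) {
            if (i != 0) decl += ", ";
            decl += fields_[i];
        }
        decl += " }\n";
        return decl + code_;
    }

private:
    std::vector<std::string> fields_;
    std::string code_;
};

// Hypothetical driver: uses field 0 before field 1 has been declared.
std::string demo_module() {
    ModuleText m;
    unsigned a = m.add_field("i32");
    m.emit_field_address("p0", a);
    unsigned b = m.add_field("float");
    m.emit_field_address("p1", b);
    assert(a == 0 && b == 1);
    return m.finish();
}
```

The field list here plays the role of a type declaration that stays editable until the end, which is what the in-memory API discussed in this thread does not offer for struct types.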