On Apr 20, 2008, at 5:36 PM, Gordon Henriksen wrote:

> The shadow stack walker is in the runtime directory with the semispace
> heap example. The runtime directory is built to LLVM IR using llvm-gcc.
> So it's skipped unless you configure llvm with llvm-gcc support.

Doh! That's how I missed the binary. Thanks!

> Since the semispace heap doesn't actually work (it's an example, at
> best), I suggest you simply copy the stack visitor into your project;
> it's only a dozen lines of code or so.

Ok, copying; can't find ShadowStackEntry though. Even make in that dir
doesn't work:

/usr/local/llvm-2.2/runtime/GC/SemiSpace $ sudo make
Password:
llvm[0]: Compiling semispace.c for Release build (bytecode)
semispace.c:107: error: expected specifier-qualifier-list before 'ShadowStackEntry'
semispace.c:111: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
semispace.c: In function 'llvm_cg_walk_gcroots':
semispace.c:114: error: 'StackEntry' undeclared (first use in this function)
semispace.c:114: error: (Each undeclared identifier is reported only once
semispace.c:114: error: for each function it appears in.)
semispace.c:114: error: 'R' undeclared (first use in this function)
make: *** [/usr/local/llvm-2.2/runtime/GC/SemiSpace/Release/semispace.ll] Error 1

It *seems* like it could be StackEntry instead? Perhaps this is a type I
must include / generate for my type system?

>> %a = malloc i32
>> %pa = alloca i32*
>> store i32* %a, i32** %pa
>>
>> %c = bitcast i32** %pa to i8**
>> call void @llvm.gcroot(i8** %c, i8* null)
>> ; *pa = 99;
>
> Note that the malloc instruction always allocates from the system
> heap, not your managed heap; putting a malloc pointer into a GC
> pointer will probably confuse your collector. So you'll likely need to
> replace 'malloc i32' with some call into your own allocator.

Yep, was going to get to that once I could bind; was trying one GC thing
at a time. :)

> Your allocator should probably bzero the memory before returning it;
> malloc returns uninitialized memory, which will crash the collector if
> you reach a collection point before completely initializing the
> object.

Will do that too :)

Got a simple, complete t.ll file that works with the semispace thing? I
could reproduce stuff from the shadow stack paper, I guess.

How does the gc "shadow-stack" gcroot intrinsic work exactly? I couldn't
read the assembly very well. Seems my example above wouldn't work, would
it, unless I create/fill in a shadow stack record?

Taking a giant step back, I can build something similar to semispace.c
myself so I'm in control of my world, right? I would set up the shadow
stack using IR instructions and could avoid gcroot by notifying my
collector as I see fit...

Sorry I'm so lost... just trying to figure out what llvm does for me and
what I have to do.

Ter
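The two allocator points quoted above (call into your own allocator instead of using 'malloc', and zero the memory before handing it out) can be sketched as follows. This is only an illustration under stated assumptions: the name my_gc_alloc, the fixed 64 KiB heap, and the bump-pointer strategy are all hypothetical and come neither from LLVM nor from semispace.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace {
// Hypothetical fixed-size managed heap; a real runtime would grow it or collect.
const std::size_t kHeapSize = 1 << 16;
alignas(std::max_align_t) unsigned char heap[kHeapSize];
std::size_t heapTop = 0;  // offset of the next free byte
}  // namespace

// Hypothetical entry point that generated IR would call instead of 'malloc'.
// Returns zero-filled memory, or nullptr when the heap is exhausted.
extern "C" void *my_gc_alloc(std::size_t size) {
  const std::size_t align = alignof(std::max_align_t);
  if (size > kHeapSize) return nullptr;  // also guards the rounding below
  const std::size_t rounded = (size + align - 1) & ~(align - 1);
  if (rounded > kHeapSize - heapTop) return nullptr;  // a real runtime would collect here
  assert(heapTop % align == 0);
  void *p = heap + heapTop;
  heapTop += rounded;
  // Zero the block so that a collection that runs before the object is fully
  // initialized sees null pointers rather than garbage. This matters once
  // heap space is reused after a collection.
  std::memset(p, 0, rounded);
  return p;
}
```

In the IR from the example, the '%a = malloc i32' line would then become a call to whatever allocator function the runtime exports, followed by a bitcast to i32*.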
On 2008-04-20, at 21:05, Terence Parr wrote:

> On Apr 20, 2008, at 5:36 PM, Gordon Henriksen wrote:
>
>> Since the semispace heap doesn't actually work (it's an example, at
>> best), I suggest you simply copy the stack visitor into your
>> project; it's only a dozen lines of code or so.
>
> Ok, copying; can't find ShadowStackEntry though. Even make in that
> dir doesn't work:

Please use the version from subversion; this is broken in 2.2 release,
unfortunately.

> how does the gc "shadow-stack" gcroot intrinsic work exactly? I
> couldn't read the assembly very well. Seems my example above
> wouldn't work would it unless i create/fill in a shadow stack record?

'gc "shadow-stack"' in the LLVM IR instructs the code generator to
automatically maintain the linked list of stack frames. You don't have
to do anything to maintain these shadow stack frames except to keep your
variables in the llvm.gcroot'd allocas. Essentially, it does this:

    struct ShadowStackEntry {
      ShadowStackLink *next;
      const ShadowStackMetadata *metadata;
      void *roots[0];
    };

    template <size_t count>
    struct Roots {
      ShadowStackLink *next;
      const ShadowStackMetadata *metadata;
      void *roots[0];
    };

    ShadowStackEntry *shadowStackHead;

    // Defined by the code generator.
    const ShadowStackMetadata f_metadata = ...;

    void f() {
      Roots<3> roots;
      roots.next = shadowStackHead;
      roots.metadata = f_metadata;
      roots.roots[0] = NULL;
      roots.roots[1] = NULL;
      roots.roots[2] = NULL;
      shadowStackHead = (ShadowStackEntry *) &roots;

      ... user code ...

      shadowStackHead = entry.next; // before any exit
      return;
    }

> Taking a giant step back, I can build something similar to
> semispace.c myself so I'm in control of my world, right? i would
> set up the shadow stack using IR instructions and could avoid gcroot
> by notifying my collector as I see fit...

That's true; the shadow stack design is explicitly for uncooperative
environments, after all.

When you want to eliminate the shadow stack overhead, you will need to
(a.) use a conservative GC or (b.) emit stack frame metadata using the
LLVM GC support.

> Sorry I'm so lost...just trying to figure out what llvm does for me
> and what I have to do.

No problem!

Generally speaking, LLVM is going to help you find roots on the stack,
which is the part that the compiler backend must help with; the rest is
your playground. The infrastructure is more suited toward interfacing
with an existing GC rather than necessarily making writing a new runtime
trivial. (See exception handling for precedent…)

-- Gordon
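For readers who want to compile the idea, here is a variant of the sketch in the message above with a few adjustments that I assume match its intent: a single header type is used where the sketch mixes ShadowStackLink and ShadowStackEntry, the root array is sized by the template parameter, the metadata is stored by address, and the epilogue restores the head from the frame itself. ShadowStackMetadata holding only a root count, the helper shadowStackDepth, and the caller g are hypothetical additions for the demonstration; LLVM's actual generated structures may differ.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the per-function constant; only a root count is assumed here.
struct ShadowStackMetadata {
  std::size_t numRoots;
};

// Common header shared by every frame on the shadow stack.
struct ShadowStackEntry {
  ShadowStackEntry *next;               // caller's frame
  const ShadowStackMetadata *metadata;  // describes this frame's roots
};

// One concrete frame: the header followed by Count root slots.
template <std::size_t Count>
struct Roots {
  ShadowStackEntry header;
  void *roots[Count];
};

ShadowStackEntry *shadowStackHead = nullptr;

// Hypothetical helper: counts the frames currently linked into the chain.
std::size_t shadowStackDepth() {
  std::size_t depth = 0;
  for (const ShadowStackEntry *e = shadowStackHead; e != nullptr; e = e->next)
    ++depth;
  return depth;
}

const ShadowStackMetadata f_metadata = {3};
const ShadowStackMetadata g_metadata = {1};

std::size_t f() {
  // Prologue: link a frame with three null roots onto the shadow stack.
  Roots<3> frame;
  frame.header.next = shadowStackHead;
  frame.header.metadata = &f_metadata;
  for (void *&slot : frame.roots) slot = nullptr;
  shadowStackHead = &frame.header;

  // "User code": here it only observes how deep the chain is.
  assert(frame.header.metadata->numRoots == 3);
  const std::size_t depthInside = shadowStackDepth();

  // Epilogue: unlink the frame before any exit.
  shadowStackHead = frame.header.next;
  return depthInside;
}

// A caller with one root of its own, to show that frames nest.
std::size_t g() {
  Roots<1> frame;
  frame.header.next = shadowStackHead;
  frame.header.metadata = &g_metadata;
  frame.roots[0] = nullptr;
  shadowStackHead = &frame.header;

  const std::size_t depthInF = f();

  shadowStackHead = frame.header.next;
  return depthInF;
}
```

Calling f directly sees one frame on the chain; calling it through g sees two, and the head is back to null once both have returned.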
On Apr 20, 2008, at 6:52 PM, Gordon Henriksen wrote:

> On 2008-04-20, at 21:05, Terence Parr wrote:
>
>> On Apr 20, 2008, at 5:36 PM, Gordon Henriksen wrote:
>>
>>> Since the semispace heap doesn't actually work (it's an example,
>>> at best), I suggest you simply copy the stack visitor into your
>>> project; it's only a dozen lines of code or so.
>>
>> Ok, copying; can't find ShadowStackEntry though. Even make in that
>> dir doesn't work:
>
> Please use the version from subversion; this is broken in 2.2
> release, unfortunately.

Ah! Ok, looks better now. :)

>> how does the gc "shadow-stack" gcroot intrinsic work exactly? I
>> couldn't read the assembly very well. Seems my example above
>> wouldn't work would it unless i create/fill in a shadow stack record?
>
> 'gc "shadow-stack"' in the LLVM IR instructs the code generator to
> automatically maintain the linked list of stack frames. You don't
> have to do anything to maintain these shadow stack frames except to
> keep your variables in the llvm.gcroot'd allocas. Essentially, it
> does this:
>
> struct ShadowStackEntry {
>   ShadowStackLink *next;
>   const ShadowStackMetadata *metadata;
>   void *roots[0];
> };

Ok, bear with me here... What's the difference between ShadowStackLink
and ShadowStackEntry?

> template <size_t count>
> struct Roots {
>   ShadowStackLink *next;
>   const ShadowStackMetadata *metadata;
>   void *roots[0];
> };
>
> ShadowStackEntry *shadowStackHead;
>
> // Defined by the code generator.
> const ShadowStackMetadata f_metadata = ...;

Do you mean generated by my front end that emits IR, or do you mean the
backend? It seems that, since I read the source code and build the
symbol table, I would need to build this stack frame type information
for LLVM.

> void f() {
>   Roots<3> roots;
>   roots.next = shadowStackHead;
>   roots.metadata = f_metadata;
>   roots.roots[0] = NULL;
>   roots.roots[1] = NULL;
>   roots.roots[2] = NULL;

What are the three roots here? Not sure where anything but next and
metadata is coming from. So the gc "shadow-stack" generates that
preamble code? That would make sense.

>   shadowStackHead = (ShadowStackEntry *) &roots;
>
>   ... user code ...

Here is where my gcroots go then, I guess.

>   shadowStackHead = entry.next; // before any exit
>   return;
> }

Can you tell me where to find ShadowStackMetadata? A search does not
reveal it:

/usr/local/llvm-2.2 $ find . -name 'ShadowStackMetadata*'

>> Taking a giant step back, I can build something similar to
>> semispace.c myself so I'm in control of my world, right? i would
>> set up the shadow stack using IR instructions and could avoid
>> gcroot by notifying my collector as I see fit...
>
> That's true; the shadow stack design is explicitly for uncooperative
> environments, after all.

The compiler plug-in for a GC is like a sophisticated macro that knows
how to emit preambles and postambles for each function that says it uses
that particular GC, right? Does it do more than an include would, such
as figuring out which allocas I have that are pointers? If so, then why
do I need to use gcroot instructions to identify roots? Seems like it
would be much easier to understand to just have my output templates emit
the preamble and so on. Oh, maybe the optimizer removes some things, so
that what I think is a root is actually not around anymore.

> When you want to eliminate the shadow stack overhead, you will need
> to (a.) use a conservative GC or (b.) emit stack frame metadata
> using the LLVM GC support.

Unfortunately, I'm thoroughly confused about who generates what. Who is
supposed to generate the metadata types? If I am, that is fine, but I
really can't find anything in the documentation that is a simple
end-to-end C code -> IR example. Once I get one together, I'll put it in
the book I'm writing. I've spent many hours reading and playing as much
as I can, but it is still not clear; 'course I ain't always that
bright. ;) Note that the paper by Henderson was extremely clear to me,
so it's not the concepts, it is the details of using LLVM to do GC.

>> Sorry I'm so lost...just trying to figure out what llvm does for me
>> and what I have to do.
>
> No problem!
>
> Generally speaking, LLVM is going to help you find roots on the
> stack, which is the part that the compiler backend must help with;
> the rest is your playground.

Is that because only code generation knows what roots exist after
processing the IR?

> The infrastructure is more suited toward interfacing with an
> existing GC rather than necessarily making writing a new runtime
> trivial. (See exception handling for precedent…)

Well, writing a new garbage collector seems really straightforward (like
mark and sweep). LLVM will give me the roots and I am free to walk them.
The part that I don't understand is who defines what metadata types and
how exactly I make use of gcroot and LLVM's support. The concepts are
clear; the details seem miles away ;)

Thanks for all the help... Has anybody else on the list gotten a trivial
GC'd language working that I could look at? I'll go back to the Scheme
translator again to see what I can learn.

Thanks,
Ter
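The "LLVM will give me the roots and I am free to walk them" step can be sketched as a small visitor over the chain of frames, which is roughly what the "dozen lines" stack visitor mentioned earlier in the thread has to do. Everything here is an assumption made for illustration: the type names follow the sketch quoted above rather than any LLVM header (the make errors earlier in the thread suggest the runtime itself uses a name like StackEntry), the metadata is reduced to a root count, and root slots are assumed to sit directly after the frame header in memory.

```cpp
#include <cassert>
#include <cstddef>

// Assumed per-function constant: just the number of roots in the frame.
struct ShadowStackMetadata {
  std::size_t numRoots;
};

// Frame header; the root slots are assumed to follow it directly in memory.
struct ShadowStackEntry {
  ShadowStackEntry *next;
  const ShadowStackMetadata *metadata;
};

ShadowStackEntry *shadowStackHead = nullptr;

// Returns the root slots stored immediately after the header.
inline void **rootsOf(ShadowStackEntry *entry) {
  return reinterpret_cast<void **>(entry + 1);
}

typedef void (*RootVisitor)(void **root, void *context);

// Calls 'visit' once for every root slot in every frame, innermost frame first.
void visitRoots(RootVisitor visit, void *context) {
  for (ShadowStackEntry *e = shadowStackHead; e != nullptr; e = e->next) {
    void **roots = rootsOf(e);
    for (std::size_t i = 0; i != e->metadata->numRoots; ++i)
      visit(&roots[i], context);
  }
}

// Hand-written frame with two roots, standing in for generated code.
struct DemoFrame {
  ShadowStackEntry header;
  void *roots[2];
};
static_assert(offsetof(DemoFrame, roots) == sizeof(ShadowStackEntry),
              "root slots must start right after the header");

// Example visitor: counts the slots that currently hold a non-null pointer.
static void countLive(void **root, void *context) {
  if (*root != nullptr) ++*static_cast<std::size_t *>(context);
}

std::size_t demoCountLiveRoots() {
  static const ShadowStackMetadata meta = {2};
  int object = 99;  // stands in for a heap object
  DemoFrame frame = {{shadowStackHead, &meta}, {&object, nullptr}};
  shadowStackHead = &frame.header;  // prologue

  std::size_t live = 0;
  visitRoots(countLive, &live);  // what a collector would do at a safe point

  shadowStackHead = frame.header.next;  // epilogue
  return live;
}
```

A mark-and-sweep collector would pass a visitor that marks the object each non-null slot points to; a copying collector would also overwrite the slot with the object's new address, which is why the visitor receives the address of the slot rather than its value.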