nicolas geoffray
2010-Sep-25 08:04 UTC
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
Hi Talin,

On Sat, Sep 25, 2010 at 4:18 AM, Talin <viridia at gmail.com> wrote:

> Many languages support the notion of a "value type". Value types are always
> passed by value, unlike reference types which are always passed by
> pointer. An example is the "struct" type in C#. Another example is a "tuple"
> type. A value type which is a local variable lives on the stack as an
> alloca, not on the heap. When a function is called with a value type as
> argument, the callee gets its own copy of the argument, rather than sharing
> a pointer with the caller.

Yes.

> Value types are represented in LLVM using structs, and may contain pointer
> fields which need to be traced.

Yes.

> The way that I handle non-pointer types is to generate an array of field
> offsets (containing the offset of each pointer field within the struct) as
> the metadata argument to llvm.gcroot. This meta argument is then processed
> in my GCStrategy, where I add the stack root offset to the offsets in the
> field offset array, which yields the stack offsets of the actual pointers in
> the call frame.

Did you think of the alternative of calling llvm.gcroot on pointers in this struct? This requires changing the verifier to support non-alloca pointers in llvm.gcroot, but it makes the solution more general and cleaner: pointers given to llvm.gcroot only point to objects in the heap.

I think that, originally, the purpose of the second argument of llvm.gcroot was to emit static type information.

Nicolas

> It's all pretty simple really.
>
>> Nicolas
>>
>> On Fri, Sep 24, 2010 at 7:00 PM, Chris Lattner <clattner at apple.com> wrote:
>>
>>> On Sep 22, 2010, at 8:52 AM, Talin wrote:
>>> > I'm moving this thread to llvm-dev in the hopes of reaching a wider
>>> > audience.
>>> >
>>> > This patch relaxes the restriction on llvm.gcroot so that it can work
>>> > with non-pointer allocas. The only changes are to Verifier.cpp - it
>>> > appears from my testing that llvm.gcroot always worked fine with
>>> > non-pointer allocas, except that the verifier wouldn't allow it. I've
>>> > used this patch to build an efficient stack crawler (an alternative to
>>> > shadow-stack that uses only static constant data structures.)
>>> >
>>> > Here's a deal: If you accept this patch, I'll write up an extensive
>>> > tutorial on how to write a stack crawler like mine. (Actually, it's
>>> > already written, however without this patch the tutorial doesn't make
>>> > any sense.)
>>>
>>> Hi Talin,
>>>
>>> I don't think anyone is really using the GC support, other than Nicolas
>>> in VMKit. If he's ok with the change, I am too. Please make sure the dox
>>> stay up to date though.
>>>
>>> -Chris
>
> --
> -- Talin
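The field-offset scheme in the quoted text above (add the stack root offset to each pointer-field offset to get the pointer's position in the call frame) can be sketched in a few lines of standalone C++. This is only an illustration, not Talin's implementation and not LLVM API: the names FieldOffsetTable and pointerSlotsInFrame are hypothetical, offsets are assumed to be in bytes relative to the frame base, and pointer fields are assumed to be pointer-aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-type metadata: the byte offset of every pointer field,
// measured from the start of the value-type struct.
struct FieldOffsetTable {
    std::vector<std::size_t> pointer_field_offsets;
};

// Given the frame-relative offset of a rooted alloca and the table for its
// type, return the frame-relative offset of each pointer slot inside it.
std::vector<std::int64_t> pointerSlotsInFrame(std::int64_t root_offset,
                                              const FieldOffsetTable& table) {
    std::vector<std::int64_t> slots;
    slots.reserve(table.pointer_field_offsets.size());
    for (std::size_t field : table.pointer_field_offsets) {
        // Assumption of this sketch: pointer fields are pointer-aligned.
        assert(field % alignof(void*) == 0);
        slots.push_back(root_offset + static_cast<std::int64_t>(field));
    }
    return slots;
}
```

For example, a struct rooted at frame offset -32 with pointer fields at byte offsets 8 and 24 has its pointers at frame offsets -24 and -8. The table itself is constant data, which fits the description of a stack crawler that uses only static constant data structures.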
Talin
2010-Sep-25 16:38 UTC
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
On Sat, Sep 25, 2010 at 1:04 AM, nicolas geoffray <nicolas.geoffray at gmail.com> wrote:

> Did you think of the alternative of calling llvm.gcroot on pointers in this
> struct? This requires to change the verifier to support non-alloca pointers
> in llvm.gcroot, but it makes the solution more general and cleaner: pointers
> given to llvm.gcroot only point to objects in the heap.
>
> I think that, originally, the purpose of the second argument of llvm.gcroot
> was to emit static type information.

Let me give you a more complicated example to see why this won't work:

Imagine I have a discriminated union type, whose type declaration looks like this:

    var x:int or String

The variable 'x' can be either an integer or a reference to a string object. In LLVM assembly, this data structure is represented by the following struct:

    { i1, String * }

The 'i1' field (the 'discriminator') is used to determine what kind of value is currently stored in the union. If it's 0, then it's an int, and the structure will be cast to { i8, int } before extracting the value. If it's 1, then it's a String pointer. The compiler does not allow access to the wrong type: if the value is 0, the language does not allow you to extract the value as a String.

Now, suppose we declare this as a local variable, so the union struct is contained within an alloca. We want to declare the String pointer as a root, but only if the discriminator is not 0. We can't determine this at compile time; instead, the collector has to be smart enough to examine the union and determine whether it contains a pointer or not.

In my compiler, what I do is generate a callback function that can trace the object. This callback function is contained within a data structure that is passed as the metadata argument to llvm.gcroot.

So my code looks like this (bit casts omitted for simplicity):

    %int_or_string = type { i8, String * }
    %x = alloca %int_or_string
    call void @llvm.gcroot(i8** %x, i8* @.tracetable.int_or_string)

Where '.tracetable.int_or_string' is the static type information for the "int or string" type, containing both the field offsets and the callback function to test the value of the discriminator.

Note that if I only declared the pointer as a root, then this wouldn't work: the collector needs access to the entire data structure in order to trace the object correctly.

Also, I think this is the right solution: llvm.gcroot is only responsible for the offset of the base of the alloca, not for any of its internal structure, which is the responsibility of the compiler and the GCStrategy.
--
-- Talin
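To make the discriminated-union argument above concrete, here is a rough standalone C++ sketch of a trace table that carries both a list of field offsets and a callback, and of a collector routine that consults it. It is not Talin's code and uses no LLVM API: TraceTable, RootVisitor, traceRoot, traceIntOrString and the IntOrString layout are all hypothetical names chosen for illustration, and the String pointer is modeled as a plain void* to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the { i8, String * } struct from the post.
struct IntOrString {
    std::uint8_t discriminator;   // 0 = int, 1 = String pointer
    union {
        std::int32_t as_int;
        void* as_string;          // a String* in the post; void* in this sketch
    } payload;
};

// The collector passes a visitor that receives the address of each live
// pointer slot, so a moving collector could update the slot in place.
using RootVisitor = void (*)(void** slot, void* context);
using TraceFn = void (*)(void* base, RootVisitor visit, void* context);

// Hypothetical static type information, passed as the gcroot metadata.
struct TraceTable {
    const std::size_t* pointer_offsets;  // fields that are always pointers
    std::size_t pointer_count;
    TraceFn trace;                       // optional callback, may be null
};

// Callback for "int or String": report the slot only when the
// discriminator says a pointer is currently stored.
void traceIntOrString(void* base, RootVisitor visit, void* context) {
    auto* value = static_cast<IntOrString*>(base);
    if (value->discriminator != 0) {
        visit(&value->payload.as_string, context);
    }
}

const TraceTable kIntOrStringTable = {nullptr, 0, &traceIntOrString};

// What a stack crawler might do for one root: visit the unconditional
// pointer fields, then let the type's callback handle the rest.
void traceRoot(void* base, const TraceTable& table,
               RootVisitor visit, void* context) {
    assert(base != nullptr);
    for (std::size_t i = 0; i < table.pointer_count; ++i) {
        char* slot = static_cast<char*>(base) + table.pointer_offsets[i];
        visit(reinterpret_cast<void**>(slot), context);
    }
    if (table.trace != nullptr) {
        table.trace(base, visit, context);
    }
}

// Example visitor: record every slot address that was reported.
void collectSlot(void** slot, void* context) {
    static_cast<std::vector<void**>*>(context)->push_back(slot);
}
```

Calling traceRoot on an IntOrString whose discriminator is 0 reports no slots, while the same call with discriminator 1 reports the address of the payload slot. The callback needs the base address of the whole struct to read the discriminator, which is why rooting only the pointer field would not be enough in this design.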
nicolas geoffray
2010-Sep-25 17:51 UTC
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
I didn't have unions in mind - indeed, you need some kind of static information in such a case.

Since the GC infrastructure in LLVM gets so little love, I think it is good if you can improve it in any way, as well as define new interfaces.

Cheers,
Nicolas