nicolas geoffray
2010-Sep-25 17:51 UTC
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
I didn't have unions in mind - indeed you need some kind of static
information in such a case. The GC infrastructure in LLVM having so little
love, I think it is good if you can improve it in any way, as well as
defining new interfaces.

Cheers,
Nicolas

On Sat, Sep 25, 2010 at 6:38 PM, Talin <viridia at gmail.com> wrote:

> On Sat, Sep 25, 2010 at 1:04 AM, nicolas geoffray
> <nicolas.geoffray at gmail.com> wrote:
>
>> Hi Talin,
>>
>> On Sat, Sep 25, 2010 at 4:18 AM, Talin <viridia at gmail.com> wrote:
>>>
>>> Many languages support the notion of a "value type". Value types are
>>> always passed by value, unlike reference types which are always passed
>>> by pointer. An example is the "struct" type in C#. Another example is a
>>> "tuple" type. A value type which is a local variable lives on the stack
>>> as an alloca, not on the heap. When a function is called with a value
>>> type as argument, the callee gets its own copy of the argument, rather
>>> than sharing a pointer with the caller.
>>
>> Yes.
>>
>>> Value types are represented in LLVM using structs, and may contain
>>> pointer fields which need to be traced.
>>
>> Yes.
>>
>>> The way that I handle non-pointer types is to generate an array of
>>> field offsets (containing the offset of each pointer field within the
>>> struct) as the metadata argument to llvm.gcroot. This meta argument is
>>> then processed in my GCStrategy, where I add the stack root offset to
>>> the offsets in the field offset array, which yields the stack offsets
>>> of the actual pointers in the call frame.
>>
>> Did you think of the alternative of calling llvm.gcroot on pointers in
>> this struct? This requires changing the verifier to support non-alloca
>> pointers in llvm.gcroot, but it makes the solution more general and
>> cleaner: pointers given to llvm.gcroot only point to objects in the
>> heap.
>>
>> I think that, originally, the purpose of the second argument of
>> llvm.gcroot was to emit static type information.
>
> Let me give you a more complicated example to see why this won't work:
>
> Imagine I have a discriminated union type, whose type declaration looks
> like this:
>
>     var x:int or String.
>
> The variable 'x' can be either an integer or a reference to a string
> object. In LLVM assembly, this data structure is represented by the
> following struct:
>
>     { i1, String * }
>
> The 'i1' field (the 'discriminator') is used to determine what kind of
> value is currently stored in the union. If it's 0, then it's an int, and
> the structure will be cast to { i8, int } before extracting the value.
> If it's 1, then it's a String pointer. The compiler does not allow
> access to the wrong type - if the value is 0, the language does not
> allow you to extract the value as a String.
>
> Now, suppose we declare this as a local variable, so the union struct is
> contained within an alloca. We want to declare the String pointer as a
> root, but only if the discriminator is not 0. We can't determine this at
> compile time; instead, the collector has to be smart enough to examine
> the union and determine whether it contains a pointer or not.
>
> In my compiler, what I do is to generate a callback function that can
> trace the object. This callback function is contained within a data
> structure that is passed as the metadata argument to llvm.gcroot.
>
> So my code looks like this (bit casts omitted for simplicity):
>
>     %int_or_string = type { i8, String * }
>     %x = alloca %int_or_string
>     call void @llvm.gcroot(i8** %x, i8* @.tracetable.int_or_string)
>
> Where '.tracetable.int_or_string' is the static type information for the
> "int or string" type, containing both the field offsets and the callback
> function to test the value of the discriminator.
>
> Note that if I only declared the pointer as a root, then this wouldn't
> work - the collector needs access to the entire data structure in order
> to trace the object correctly.
>
> Also, I think this is the right solution - llvm.gcroot is only
> responsible for the offset of the base of the alloca, not for any of its
> internal structure, which is the responsibility of the compiler and the
> GCStrategy.
>
>> Nicolas
>>
>>> It's all pretty simple really.
>>>
>>>> Nicolas
>>>>
>>>> On Fri, Sep 24, 2010 at 7:00 PM, Chris Lattner <clattner at apple.com>
>>>> wrote:
>>>>
>>>>> On Sep 22, 2010, at 8:52 AM, Talin wrote:
>>>>> > I'm moving this thread to llvm-dev in the hopes of reaching a
>>>>> > wider audience.
>>>>> >
>>>>> > This patch relaxes the restriction on llvm.gcroot so that it can
>>>>> > work with non-pointer allocas. The only changes are to
>>>>> > Verifier.cpp - it appears from my testing that llvm.gcroot always
>>>>> > worked fine with non-pointer allocas, except that the verifier
>>>>> > wouldn't allow it. I've used this patch to build an efficient
>>>>> > stack crawler (an alternative to shadow-stack that uses only
>>>>> > static constant data structures.)
>>>>> >
>>>>> > Here's a deal: If you accept this patch, I'll write up an
>>>>> > extensive tutorial on how to write a stack crawler like mine.
>>>>> > (Actually, it's already written; however, without this patch the
>>>>> > tutorial doesn't make any sense.)
>>>>>
>>>>> Hi Talin,
>>>>>
>>>>> I don't think anyone is really using the GC support, other than
>>>>> Nicolas in VMKit. If he's ok with the change, I am too. Please make
>>>>> sure the dox stay up to date though.
>>>>>
>>>>> -Chris
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>> --
>>> -- Talin
>
> --
> -- Talin
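The two kinds of static metadata Talin describes in the quoted message above (a field-offset array for plain value types, and a callback for discriminated unions) can be sketched roughly as follows. This is only an illustration under stated assumptions: `TraceTable`, `traceRoot`, `traceIntOrString`, `collect`, and the `Pair` and `IntOrString` layouts are hypothetical names invented here, not part of LLVM or of Talin's compiler, and the call frame is modeled as an ordinary block of memory rather than a real machine stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical collector-side types; none of these names come from LLVM.
using RootVisitor = void (*)(void **slot, void *ctx);
using TraceFn = void (*)(void *base, RootVisitor visit, void *ctx);

// Assumed shape of the static data passed as llvm.gcroot's metadata argument.
struct TraceTable {
  const std::size_t *fieldOffsets; // byte offsets of unconditional pointer fields
  std::size_t fieldCount;          // number of entries in fieldOffsets
  TraceFn custom;                  // optional callback for unions; may be null
};

// A plain value type with two pointer fields and one non-pointer field.
struct Pair {
  void *first;
  long count;
  void *second;
};

// Model of the "int or String" union: a discriminator plus a payload.
struct IntOrString {
  std::uint8_t isString; // 0 = payload is an int, 1 = payload is a pointer
  union {
    std::intptr_t asInt;
    void *asString;
  } payload;
};

// Callback for the union: report the payload only when it holds a pointer.
void traceIntOrString(void *base, RootVisitor visit, void *ctx) {
  auto *u = static_cast<IntOrString *>(base);
  if (u->isString != 0)
    visit(&u->payload.asString, ctx);
}

// Visit every pointer slot inside one stack root. The root's offset within
// the frame is added to each field offset from the table, which yields the
// location of the actual pointer in the frame.
void traceRoot(char *frameBase, std::size_t rootOffset,
               const TraceTable &table, RootVisitor visit, void *ctx) {
  assert(frameBase != nullptr);
  assert(table.fieldCount == 0 || table.fieldOffsets != nullptr);
  char *base = frameBase + rootOffset;
  for (std::size_t i = 0; i < table.fieldCount; ++i)
    visit(reinterpret_cast<void **>(base + table.fieldOffsets[i]), ctx);
  if (table.custom != nullptr)
    table.custom(base, visit, ctx);
}

// Example visitor: record each live pointer in a std::vector<void *>.
void collect(void **slot, void *ctx) {
  static_cast<std::vector<void *> *>(ctx)->push_back(*slot);
}
```

In this sketch a collector would call `traceRoot` once for each (root offset, table) pair recorded for the frame being scanned. Passing only the address of the union's pointer field would not be enough, because the callback has to read the discriminator to decide whether that field currently holds a pointer.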
Talin
2010-Sep-25 21:51 UTC
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
On Sat, Sep 25, 2010 at 10:51 AM, nicolas geoffray
<nicolas.geoffray at gmail.com> wrote:

> I didn't have unions in mind - indeed you need some kind of static
> information in such a case. The GC infrastructure in LLVM having so little
> love, I think it is good if you can improve it in any way, as well as
> defining new interfaces.

So the patch is OK then? All it does is change the verifier -- llvm.gcroot
already has the ability to do this, it's just that the verifier wouldn't
allow it.

> Cheers,
> Nicolas

--
-- Talin
nicolas geoffray
2010-Sep-26 06:37 UTC
[LLVMdev] Patch to allow llvm.gcroot to work with non-pointer allocas.
Yes, it's definitely OK. In the future, I think the verifier will also be
changed to support non-allocas in llvm.gcroot.

Nicolas

On Sat, Sep 25, 2010 at 11:51 PM, Talin <viridia at gmail.com> wrote:

> On Sat, Sep 25, 2010 at 10:51 AM, nicolas geoffray
> <nicolas.geoffray at gmail.com> wrote:
>
>> I didn't have unions in mind - indeed you need some kind of static
>> information in such a case. The GC infrastructure in LLVM having so
>> little love, I think it is good if you can improve it in any way, as
>> well as defining new interfaces.
>
> So the patch is OK then? All it does is change the verifier --
> llvm.gcroot already has the ability to do this, it's just that the
> verifier wouldn't allow it.
>
> --
> -- Talin