Jay K via llvm-dev
2016-Jul-04 05:34 UTC
[llvm-dev] Status of stack walking in LLVM on Win64?
> Message: 3
> Date: Sun, 3 Jul 2016 17:49:50 -0700
> From: Michael Lewis via llvm-dev <llvm-dev at lists.llvm.org>
> To: Hayden Livingston <halivingston at gmail.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64?
> Message-ID:
> <CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD+9Edv2pi71Dw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Sun, Jul 3, 2016 at 2:17 PM, Hayden Livingston <halivingston at gmail.com>
> wrote:
>
>> For JITs it would appear that there is a patch needed for some kind of
>> relocations.
>>
>> https://llvm.org/bugs/show_bug.cgi?id=24233
>>
>> Is the patch really needed? What does it do? I'm not an expert here so
>> asking.
>
> I'm not really interested in the JIT case as I said originally, so I can't
> answer that question.
>
>> On Sun, Jul 3, 2016 at 2:48 AM, David Majnemer via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> I can confirm that LLVM emits correct data when used in an AoT
>>> configuration for x64, exception handling would be totally broken
>>> without it.
>
> Two points of clarification:
>
> - Are you talking about Win64 or just x64 in general (i.e. *nix/MacOS)?
> Again given the presence of bugs going back to 2015 (including one linked
> in this thread) and other scant data from the list, I really can't tell
> what the expected state of this functionality is on Win64.
>
> - Are you referring to data generated by LLVM that is embedded in COFF
> object files and then placed in the binary image by the linker? This data
> is at a minimum relocated by link.exe on Windows as near as I can tell. I
> do not want a dependency on link.exe. I can handle doing my own relocations
> prior to emitting the final image, but I want to know if there's a turnkey
> implementation of this already or if I have to roll my own here.
>
> Thanks,
>
> - Mike

Windows/x64 ABI is pretty well documented.
- The parameter passing is probably not the same as on any other
  system. (Unless people are using LLVM for UEFI development?)
  Ignoring floating point, the first four integer parameters
  are in rcx, rdx, r8, r9. The rest are on the stack.

- The exception handling might *resemble* other systems, but
  surely has unique details.

- There is absolutely an unremovable dependency on a linker;
  it doesn't have to be the Microsoft linker, I believe GNU ld
  already implements this.

The documentation should be used.
I can summarize and such, but it is documented.

Roughly, ignoring parameter passing and focusing only on exception
handling, it goes like this:

- At any point in any program, "the stack" must be "unwindable".
  I've never seen this clearly described.
  It really boils down to "non-volatile registers must be restorable"
  by "a runtime" via documented/standardized metadata, so as to
  appear as if control was returned to any function on the call stack,
  w/o running any generated code in any of the functions between
  the current stack location and the resumed-to location.

  The stack pointer is often called out specially, but in fact
  it is just another non-volatile register and not really a special
  case.

So then some details:

A "leaf function" is a function that does not change any non-volatile
registers, including the stack pointer. Leaf functions can do pretty
much anything, but they must not change any non-volatile registers --
which is a severe restriction. Having locals essentially makes you
non-leaf -- even if you don't call anything. A leaf function is *not*
simply a function that makes no calls, but calls do make a function a
non-leaf, as a call changes the stack pointer.

The slight exception here is that all functions, including leaves, do
have 4*8 bytes of scratch space in the stack available to them -- so
local variables can be had, in that space and in volatile registers.

The stack is walked from a leaf function merely by reading from rsp.
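To make the leaf case concrete, here is a minimal sketch in Python. It is an illustration only, not code from the Windows documentation: `unwind_leaf`, `stack_base` and the byte snapshot are hypothetical stand-ins for reading real stack memory. It relies only on the point above that a leaf function never moves rsp, so the return address is still at [rsp].

```python
import struct


def unwind_leaf(stack: bytes, stack_base: int, rsp: int):
    """Step out of a leaf frame: return (return_address, caller_rsp).

    `stack` is a byte snapshot of stack memory whose first byte sits at
    address `stack_base`; both are hypothetical stand-ins for a real
    memory read.
    """
    # A leaf never moves rsp, so [rsp] still holds the return address
    # pushed by the caller's call instruction.
    (return_address,) = struct.unpack_from("<Q", stack, rsp - stack_base)
    # Popping the return address is all a "ret" does to rsp.
    return return_address, rsp + 8


# Made-up snapshot: the return address at rsp, then the caller's data.
snapshot = struct.pack("<QQ", 0x140001234, 0xDEADBEEF)
return_address, caller_rsp = unwind_leaf(snapshot, stack_base=0x1000, rsp=0x1000)
```

No metadata lookup is involved here, which is why leaf functions need no pdata entry of their own.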
A leaf function can make a syscall, so leaf functions aren't
necessarily at the bottom of the stack.

Non-leaf functions are the interesting ones.
They can change rsp, including via a call, and can change non-volatile
registers, but all such changes (or rather, the saving of said
registers) must be described by metadata, and the metadata
must be findable -- via looking up a code address on the stack.

Roughly speaking, all dlls have "pdata" -- procedure data.
There are 3 UINT32s per non-leaf function.
These are offsets into the image. Images are limited to 4GB in size.
They point to the start of the function, the end of the function, and
to additional metadata.
The additional metadata is called "xdata" or exception data.
The offset to the metadata may be absent or 0, but that should be
rare/nonexistent in practice -- it is for revealing leaf functions to
static analysis, for example.

The "xdata" is then what describes how to restore non-volatile
registers, such as the order to pop them, or at what offset from the
frame pointer or stack pointer they were saved (and which register, if
any, is the frame pointer -- it doesn't have to be rbp, and most
functions don't have one).

There are restrictions on code generation -- rsp changes and
non-volatile saves must be describable with this metadata. There is a
notion of the end of the prologue; at this point all non-volatiles that
will be changed have been saved, and rsp changes are done. This is
misleading though, in that almost arbitrary code can be interleaved
within the prologue, i.e. changes to volatile registers.

As well, as background, generally Windows/x64 functions don't change
rsp, except in their prologue and the call instruction.
They are not "pushy/poppy". However, a function that uses _alloca is an
exception to that. Such functions must have a frame pointer, such as
rbp, though it doesn't have to be rbp and often is not.

There is also a notion of chaining the data.
This is useful when a function has "early out" paths that only change
some non-volatiles.

Also there is allowance for discontiguous functions.

Also there is no metadata for epilogues. If an exception occurs in
an epilogue, the runtime actually looks at the code being run, detects
it is an epilogue and simulates it. As such, epilogue code generation
is constrained. (and breakpoints within epilogues mess things up!)

To repeat -- the unwindability is from any single instruction, be it in
the middle of a prologue, the middle of an epilogue, or in the body of
a function outside of prologue/epilogue.

This unwindability serves both exception dispatch and debugger stack
walking, and other things, like sampling profiler stack walking, or
"leak tracking stack walking" -- stack walking is always possible,
modulo bugs. The most common bugs are probably in hand-written
assembly, since assembly programmers basically have to do the work
themselves.

There is provision for providing the pdata at runtime for JITed code.

The linker has to combine all the pdata and place a pointer (offset)
to it in a documented place in the PE, similar to how imports and
exports and base relocations are recorded.

Anyway, see the documentation.

- Jay
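The pdata lookup and the "restore non-volatiles without running any code" idea described above can be sketched roughly as follows, in Python. All names here (`RuntimeFunction`, `parse_pdata`, `lookup`, `unwind_frame`) are hypothetical. The sketch assumes the table is sorted by start offset and that the end offset is exclusive, and the `("push", reg)` / `("alloc", nbytes)` tuples are a deliberately simplified stand-in for real xdata, which packs its unwind codes differently. It only handles a frame whose prologue has fully executed and which has no frame pointer.

```python
import bisect
import struct
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class RuntimeFunction(NamedTuple):
    """One pdata entry: three 32-bit offsets from the image base."""
    begin_rva: int
    end_rva: int      # assumed exclusive in this sketch
    unwind_rva: int   # where this function's xdata lives


def parse_pdata(section: bytes) -> List[RuntimeFunction]:
    """Decode raw pdata bytes as little-endian UINT32 triples."""
    return [RuntimeFunction(*f) for f in struct.iter_unpack("<III", section)]


def lookup(table: List[RuntimeFunction], rva: int) -> Optional[RuntimeFunction]:
    """Find the entry covering `rva`; None means "treat as a leaf".

    Assumes `table` is sorted by begin_rva. The key list is rebuilt on
    every call to keep the sketch short.
    """
    starts = [entry.begin_rva for entry in table]
    i = bisect.bisect_right(starts, rva) - 1
    if i >= 0 and rva < table[i].end_rva:
        return table[i]
    return None


def unwind_frame(
    prologue_ops: List[Tuple[str, object]],
    regs: Dict[str, int],
    read_u64: Callable[[int], int],
) -> Dict[str, int]:
    """Undo a completed prologue and return the caller's registers.

    `prologue_ops` lists the prologue's effects in execution order, using
    the simplified ("push", reg) / ("alloc", nbytes) tuples; `read_u64`
    is a hypothetical callback that reads 8 bytes of stack memory.
    """
    regs = dict(regs)
    for op, arg in reversed(prologue_ops):  # undo in reverse order
        if op == "alloc":
            regs["rsp"] += arg              # undo "sub rsp, nbytes"
        elif op == "push":
            regs[arg] = read_u64(regs["rsp"])  # reload the saved register
            regs["rsp"] += 8
        else:
            raise ValueError(f"unknown op: {op!r}")
    regs["rip"] = read_u64(regs["rsp"])     # return address, as for a leaf
    regs["rsp"] += 8
    return regs
```

For a prologue of "push rbx; sub rsp, 0x20", the ops list would be `[("push", "rbx"), ("alloc", 0x20)]`. Unwinding from the function body then recovers the caller's rbx, rsp and return address purely from the description and a memory read, which is the property the metadata exists to guarantee.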
David Majnemer via llvm-dev
2016-Jul-04 06:05 UTC
[llvm-dev] Status of stack walking in LLVM on Win64?
On Sun, Jul 3, 2016 at 10:34 PM, Jay K via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
> Also there is no metadata for epilogues. If an exception occurs in
> an epilogue, the runtime actually looks at the code being run, detects
> it is an epilogue and simulates it. As such, epilogue code generation
> is constrained. (and breakpoints within epilogues mess things up!)

There is metadata for epilogues (UWOP_EPILOG) but it is only available on
Windows 8.1 and newer.
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Jay K via llvm-dev
2016-Jul-04 06:22 UTC
[llvm-dev] Status of stack walking in LLVM on Win64?
> There is metadata for epilogues (UWOP_EPILOG) but it is only available
> on Windows 8.1 and newer.

I'm aware of this. I believe it is so sampling profilers can walk the
kernel stack including through paged code -- i.e. the epilogue data is
not paged, while the related epilogue code might be.

Do you see it used, i.e. in usermode? (where the pdata/xdata/code are
all equally paged).

It would allow for e.g. breakpoints in epilogues as well, but that
doesn't seem to be a consideration. Perhaps debuggers are supposed to
detect epilogues and use hardware breakpoints instead??

And ps, while the documentation is good, I think this basic point of
what the goal is -- restoration of non-volatiles from arbitrary points,
with the clarification/emphasis that rsp is a slightly special
non-volatile -- is not clearly documented. It is from this motivation
that everything pretty directly follows imho.

For example, this is why all ymm registers are volatile -- because the
xdata design precedes their existence and therefore cannot describe
their preservation/restoration.

- Jay