thr3ads.net - llvm dev - [llvm-dev] Status of stack walking in LLVM on Win64? [Jul 2016]

Jay K via llvm-dev

2016-Jul-04 05:34 UTC

[llvm-dev] Status of stack walking in LLVM on Win64?

> Message: 3
> Date: Sun, 3 Jul 2016 17:49:50 -0700
> From: Michael Lewis via llvm-dev <llvm-dev at lists.llvm.org>
> To: Hayden Livingston <halivingston at gmail.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Status of stack walking in LLVM on Win64?
> Message-ID: <CAEm7p3svyOi6JU6r_RCCtRfGhTgTHeRw-SR0iD+9Edv2pi71Dw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Sun, Jul 3, 2016 at 2:17 PM, Hayden Livingston <halivingston at gmail.com> wrote:
>
>> For JITs it would appear that there is a patch needed for some kind of
>> relocations.
>>
>> https://llvm.org/bugs/show_bug.cgi?id=24233
>>
>> Is the patch really needed? What does it do? I'm not an expert here so
>> asking.
>>
>
>
> I'm not really interested in the JIT case as I said originally, so I can't
> answer that question.
>
>
>
>>
>> On Sun, Jul 3, 2016 at 2:48 AM, David Majnemer via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>> I can confirm that LLVM emits correct data when used in an AoT configuration
>>> for x64, exception handling would be totally broken without it.
>>>
>>
>
>
> Two points of clarification:
>
> - Are you talking about Win64 or just x64 in general (i.e. *nix/MacOS)?
> Again given the presence of bugs going back to 2015 (including one linked
> in this thread) and other scant data from the list, I really can't tell
> what the expected state of this functionality is on Win64.
>
> - Are you referring to data generated by LLVM that is embedded in COFF
> object files and then placed in the binary image by the linker? This data
> is at a minimum relocated by link.exe on Windows as near as I can tell. I
> do not want a dependency on link.exe. I can handle doing my own relocations
> prior to emitting the final image, but I want to know if there's a turnkey
> implementation of this already or if I have to roll my own here.
>
> Thanks,
>
>
>
> - Mike

 Windows/x64 ABI is pretty well documented. 

 - The parameter passing is probably not the same as any other system.
   (Unless people are using LLVM for UEFI development?) 
   Ignoring floating point, the first four integer parameters
   are in rcx, rdx, r8, r9. The rest are on the stack. 

 - The exception handling might *resemble* other systems, but
   surely has unique details.

 - There is absolutely an unremovable dependency on a linker;
   it doesn't have to be the Microsoft linker, I believe GNU ld
   already implements this.

   The documentation should be used.

   I can summarize and such, but it is documented.

   Roughly, ignoring parameter passing and focusing only on exception handling,
   it goes like this:

   - At any point in any program, "the stack" must be "unwindable".
       I've never seen this clearly described.
       It boils down to "non-volatile registers must be restorable"
       by "a runtime" via documented/standardized metadata, so as to
       appear as if control had returned to any function on the call stack,
       w/o running any generated code in any of the functions between
       the current stack location and the resumed-to location.

       The stack pointer is often called out specially, but in fact
       it is just another non volatile register and not really a special case.

     So then some details:
       A "leaf function" is a function that does not change any non-volatile registers,
       including the stack pointer. Leaf functions can do pretty much anything,
       but they must not change any non-volatile registers -- which is a severe
       restriction. Having locals essentially makes you non-leaf -- even if you
       don't call anything. A leaf function is *not* simply a function that makes no calls,
       but calls do make a function a non-leaf, as they change the stack pointer.

       The slight exception here is that all functions, including leaves, do have
       4*8 bytes of scratch space on the stack available to them -- so local
       variables can be had, in that space and in volatile registers.
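
A minimal sketch of the layout a callee sees at its first instruction, combining the scratch space with the parameter-passing rule mentioned earlier (first four integer parameters in rcx, rdx, r8, r9). The helper name and the string format are hypothetical, chosen only for illustration; floating-point arguments are ignored.

```python
# Where a callee finds its integer arguments at its first instruction,
# i.e. right after the caller's `call` pushed the return address.
# The helper name and return format are hypothetical.

INT_ARG_REGS = ["rcx", "rdx", "r8", "r9"]
SLOT = 8                   # every stack slot is 8 bytes
SCRATCH_SPACE = 4 * SLOT   # the 4*8 bytes of scratch space

def int_arg_location(index):
    """Return a register name or an 'rsp+0x..' location for argument `index` (0-based)."""
    if index < len(INT_ARG_REGS):
        return INT_ARG_REGS[index]
    # [rsp] holds the return address, [rsp+8 .. rsp+0x27] is the scratch
    # space, so the fifth argument (index 4) sits at rsp+0x28.
    offset = SLOT + SCRATCH_SPACE + (index - len(INT_ARG_REGS)) * SLOT
    return "rsp+0x%x" % offset

print([int_arg_location(i) for i in range(6)])
```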

      The stack is walked from a leaf function merely by reading the return address at rsp.
      A leaf function can make a syscall, so leaves aren't necessarily at the bottom of the stack.

      Non-leaf functions are the interesting ones.
      They can change rsp, including via a call, and can change non-volatile
      registers, but all such changes (or rather, the saving of said registers) must
      be described by metadata, and the metadata
      must be findable -- via looking up a code address on the stack.

      Roughly speaking, all dlls have "pdata" -- procedure data.
      There are 3 UINT32s per non-leaf function.
      These are offsets into the image. Images are limited to 4GB in size.
      They are to the start of the function, end of the function, and to additional metadata.
      The additional metadata is called "xdata" or exception data.
      The offset to the metadata may be absent or 0, but that should be rare/nonexistent
      in practice -- it is for revealing leaf functions to static analysis, for example.
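
As a rough sketch of how the "findable" part can work: the pdata is an array of three-UINT32 records, and a code address is mapped to its record by a search over the function start offsets. The function names and the sample table below are made up for illustration; the sketch assumes the records are sorted by start offset, as the PE format requires, and that the end offset is exclusive.

```python
import bisect
import struct

# One pdata record: three little-endian UINT32 image offsets
# (function start, function end, offset of the xdata).
ENTRY = struct.Struct("<III")

def parse_pdata(blob):
    """Split a raw pdata blob into (begin, end, xdata) tuples."""
    return [ENTRY.unpack_from(blob, off) for off in range(0, len(blob), ENTRY.size)]

def lookup(entries, code_offset):
    """Find the record covering `code_offset`; None means 'treat it as a leaf'."""
    starts = [begin for begin, _end, _xdata in entries]
    i = bisect.bisect_right(starts, code_offset) - 1
    if i >= 0 and entries[i][0] <= code_offset < entries[i][1]:
        return entries[i]
    return None

# Made-up table: two functions at image offsets 0x1000 and 0x1040.
blob = ENTRY.pack(0x1000, 0x1030, 0x3000) + ENTRY.pack(0x1040, 0x1100, 0x300C)
entries = parse_pdata(blob)
print(lookup(entries, 0x1050))   # record of the second function
print(lookup(entries, 0x1035))   # None: the offset falls between the two functions
```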

      The "xdata" is then what describes how to restore non-volatile registers,
      such as the order to pop them, or what offset they were saved at relative to the
      frame pointer or stack pointer (and which register, if any, is the frame pointer -- it doesn't have to be rbp,
      and most functions don't have one).
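
To make the shape of the xdata concrete, here is a partial decoder sketch. It follows the documented layout (a 4-byte header followed by 2-byte unwind codes) but handles only two single-slot operations, register pushes and small stack allocations; the multi-slot operations (large allocations, saves at an offset, and so on) are deliberately left out. The function name, the output format, and the sample bytes are assumptions made for this example.

```python
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
             "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
UWOP_PUSH_NONVOL = 0   # OpInfo = register number
UWOP_ALLOC_SMALL = 2   # allocation size = OpInfo * 8 + 8

def decode_xdata(blob):
    """Decode the header and the simple (single-slot) unwind codes."""
    info = {
        "version": blob[0] & 0x7,
        "flags": blob[0] >> 3,
        "prolog_size": blob[1],
        # Frame register number 0 means the function has no frame pointer.
        "frame_reg": REG_NAMES[blob[3] & 0xF] if blob[3] & 0xF else None,
        "codes": [],
    }
    for i in range(blob[2]):                 # blob[2] is the count of code slots
        code_offset = blob[4 + 2 * i]        # offset of the END of the prologue instruction
        op = blob[5 + 2 * i] & 0xF
        op_info = blob[5 + 2 * i] >> 4
        if op == UWOP_PUSH_NONVOL:
            info["codes"].append((code_offset, "push", REG_NAMES[op_info]))
        elif op == UWOP_ALLOC_SMALL:
            info["codes"].append((code_offset, "alloc", op_info * 8 + 8))
        else:
            raise NotImplementedError("unwind op %d is not handled in this sketch" % op)
    return info

# Sample xdata for the prologue: push rbp; push rbx; sub rsp, 0x28
sample = bytes([0x01, 0x06, 0x03, 0x00,   # version 1, 6-byte prologue, 3 codes, no frame reg
                0x06, 0x42,               # at offset 6: alloc 0x28
                0x02, 0x30,               # at offset 2: push rbx
                0x01, 0x50])              # at offset 1: push rbp
print(decode_xdata(sample))
```

Note that the codes are stored with the last prologue step first, which is the order in which an unwinder has to undo them.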

      There are restrictions on code generation -- rsp changes and non-volatile saves
      must be describable with this metadata. There is a notion of the end of the prologue;
      at this point all non-volatiles that will be changed have been saved, and rsp changes
      are done. This is misleading though, in that almost arbitrary code can be interleaved
      within the prologue, e.g. changes to volatile registers.
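
The following sketch simulates one unwind step under simplifying assumptions: the unwind codes are given as already-decoded tuples (offset of the end of the prologue instruction, operation, argument), memory is a plain dict, and only pushes and stack allocations are modeled. All names and addresses are invented for the example. It shows why unwinding also works from the middle of a prologue: steps that have not executed yet are simply skipped.

```python
def virtual_unwind(codes, func_start, rip, rsp, regs, memory):
    """Undo the executed part of a prologue; return the caller's (rip, rsp, regs)."""
    offset_in_func = rip - func_start
    regs = dict(regs)
    for code_offset, op, arg in codes:       # stored last-prologue-step first
        if code_offset > offset_in_func:
            continue                         # that instruction has not run yet
        if op == "alloc":
            rsp += arg                       # undo `sub rsp, arg`
        elif op == "push":
            regs[arg] = memory[rsp]          # restore the saved non-volatile
            rsp += 8
    # With no codes left (or none at all, as in a leaf) the return address is at rsp.
    return memory[rsp], rsp + 8, regs

# Prologue at 0x1000: push rbp (ends at +1); push rbx (ends at +2); sub rsp, 0x28 (ends at +6)
codes = [(6, "alloc", 0x28), (2, "push", "rbx"), (1, "push", "rbp")]
memory = {0x7FF8: 0x2005,    # return address pushed by the caller's `call`
          0x7FF0: 0xAAAA,    # caller's rbp
          0x7FE8: 0xBBBB}    # caller's rbx

# In the function body, after the whole prologue has run:
print(virtual_unwind(codes, 0x1000, 0x1010, 0x7FC0, {"rbp": 1, "rbx": 2}, memory))
# In the middle of the prologue, after only `push rbp` has run:
print(virtual_unwind(codes, 0x1000, 0x1001, 0x7FF0, {"rbp": 1, "rbx": 0xBBBB}, memory))
# A leaf function: no codes, so only the return address is read.
print(virtual_unwind([], 0x3000, 0x3004, 0x7FF8, {}, memory))
```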

      As well, as background, generally Windows/x64 functions don't change rsp,
      except in their prologue and via the call instruction.
      They are not "pushy/poppy". However, a function that uses _alloca
      contradicts this. Such functions must have a frame pointer, such as rbp,
      though it doesn't have to be rbp and often is not.

      There is also a notion of chaining the data. This is useful when
      a function has "early out" paths that only change some non-volatiles.

      Also there is allowance for discontiguous functions.

      Also there is no metadata for epilogues. If an exception occurs in an epilogue,
      the runtime actually looks at the code being run, detects that it is an epilogue,
      and simulates it. As such, epilogue code generation is constrained.
      (And breakpoints within epilogues mess things up!)
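
A heavily simplified sketch of what "looking at the code" can mean: starting at the current instruction, accept an optional `add rsp, imm`, then any number of 8-byte register pops, then `ret`. The documented epilogue forms also include `lea rsp, [frame register + constant]` and a tail-call `jmp`, which this sketch ignores; the function name is hypothetical.

```python
def looks_like_epilogue(code, pos=0):
    """True if the bytes at `pos` are: [add rsp, imm] (pop reg)* ret."""
    # Optional `add rsp, imm8` (48 83 C4 ib) or `add rsp, imm32` (48 81 C4 id).
    if code[pos:pos + 3] == b"\x48\x83\xC4":
        pos += 4
    elif code[pos:pos + 3] == b"\x48\x81\xC4":
        pos += 7
    while pos < len(code):
        byte = code[pos]
        if byte == 0xC3:                       # ret
            return True
        if 0x58 <= byte <= 0x5F:               # pop rax..rdi
            pos += 1
        elif byte == 0x41 and pos + 1 < len(code) and 0x58 <= code[pos + 1] <= 0x5F:
            pos += 2                           # pop r8..r15 (REX.B prefix)
        else:
            return False                       # anything else: not an epilogue
    return False

epilogue = bytes.fromhex("4883c428 5b 5d c3")  # add rsp,0x28; pop rbx; pop rbp; ret
body = bytes.fromhex("4889c8 c3")              # mov rax,rcx; ret
print(looks_like_epilogue(epilogue), looks_like_epilogue(epilogue, 4), looks_like_epilogue(body))
```

A breakpoint overwrites an instruction byte with `int 3` (0xCC), so a scan like this stops recognizing the epilogue, which is one way breakpoints there can mess things up.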

      To repeat -- the unwindability is from any single instruction, be it in the
      middle of a prologue, the middle of an epilogue, or in the body of a function
      outside of prologue/epilogue.

      This unwindability serves both exception dispatch and debugger stack walking,
      and other things, like sampling profiler stack walking, or "leak tracking
      stack walking" -- stack walking is always possible, modulo bugs.
      The most common bugs are probably in hand-written assembly, since
      assembly programmers have to do basically all the work themselves.

      There is provision for providing the pdata at runtime for JITed code.

      The linker has to combine all the pdata and place a pointer (offset) to it
      in a documented place in the PE, similar to how imports and exports and base
      relocations are recorded.
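
For reference, that documented place is the exception entry (index 3) of the data directory in the PE optional header. The sketch below reads it from a PE32+ image held in memory; the function name is hypothetical and the sample image is synthetic, with just enough header bytes filled in to exercise the parser.

```python
import struct

IMAGE_DIRECTORY_ENTRY_EXCEPTION = 3   # data directory slot that describes the pdata

def find_pdata(image):
    """Return (rva, size) of the pdata as recorded in a PE32+ image."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    optional = e_lfanew + 4 + 20              # skip the signature and the COFF file header
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic != 0x20B:
        raise ValueError("not a PE32+ (64-bit) image")
    (count,) = struct.unpack_from("<I", image, optional + 108)   # NumberOfRvaAndSizes
    if count <= IMAGE_DIRECTORY_ENTRY_EXCEPTION:
        raise ValueError("image has no exception directory slot")
    slot = optional + 112 + 8 * IMAGE_DIRECTORY_ENTRY_EXCEPTION
    return struct.unpack_from("<II", image, slot)

# Synthetic image: only the fields the parser reads are filled in.
image = bytearray(0x200)
struct.pack_into("<I", image, 0x3C, 0x80)                    # e_lfanew
image[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", image, 0x98, 0x20B)                   # optional header magic
struct.pack_into("<I", image, 0x98 + 108, 16)                # NumberOfRvaAndSizes
struct.pack_into("<II", image, 0x98 + 112 + 24, 0x5000, 24)  # pdata at RVA 0x5000, 2 records
print(find_pdata(bytes(image)))
```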

      Anyway, see the documentation.

      - Jay

David Majnemer via llvm-dev

2016-Jul-04 06:05 UTC

[llvm-dev] Status of stack walking in LLVM on Win64?

On Sun, Jul 3, 2016 at 10:34 PM, Jay K via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>       Also there is no metadata for epilogues. If an exception occurs in an epilogue,
>       the runtime actually looks at the code being run, detects that it is an epilogue,
>       and simulates it. As such, epilogue code generation is constrained.
>       (And breakpoints within epilogues mess things up!)
>
There is metadata for epilogues (UWOP_EPILOG), but it is only available on
Windows 8.1 and newer.

> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Jay K via llvm-dev

2016-Jul-04 06:22 UTC

[llvm-dev] Status of stack walking in LLVM on Win64?

> There is metadata for epilogues (UWOP_EPILOG), but it is only available on
> Windows 8.1 and newer.

I'm aware of this.
I believe it is so sampling profilers can walk the kernel stack including
through paged code -- i.e. the epilogue data is not paged, while the related
epilogue code might be.
Do you see it used, e.g. in usermode (where the pdata/xdata/code are all
equally paged)?
It would allow for e.g. breakpoints in epilogues as well, but that doesn't
seem to be a consideration.
Perhaps debuggers are supposed to detect epilogues and use hardware
breakpoints instead?


P.S. While the documentation is good, I think the basic point of what the
goal is -- restoration of non-volatiles from arbitrary points, with the
clarification/emphasis that rsp is a slightly special non-volatile -- is not
clearly documented.
It is from this motivation that everything pretty directly follows, imho.


For example, this is why the upper halves of all ymm registers are volatile --
because the xdata design precedes their existence and therefore cannot describe
their preservation/restoration.


 - Jay

