Juergen Ributzka
2015-Jul-09  21:04 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
Hi @ll,
over the past year we gained more experience with the
patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the
stackmap format.
The following proposal includes feedback and request from several interested
parties and I would like to hear your feedback.
Missing correlation between functions and stackmap records:
Originally the client had to keep track of the ID to know which stackmap record
belongs to which function, but this would stop working once functions are
inlined.
The new format fixes that by adding a direct reference from the function to the
stackmap records.
Call Size and Function Size:
These are additional information that are of interest and have been added to the
format.
@Swaroop: Could you please provide a little more detailed explanation on the
"Call Size" field and what exactly is there recorded. Is it just the
call instruction
or also the materialization code for the address? For what is this used for?
Flat format:
We think moving to a flat form will make parsing easier, because every record
has a fixed size and offsets can be calculated easily. The plan is to also
provide parsers for both stackmap versions (there is already one for the first
format in tree) and a corresponding C-API to make it easier for clients to
adopt the new format. There is no plan to drop the original format and we will
continue to support both formats. I will ask for feedback on the C API in a
separate RFC. 
Another benefit we hope to achieve from this format is to optimize for size by
uniquing entries - but that is independent optimization and not required.
More detailed frame record:
Clients require more information about the function frame, such as spilled
registers, etc. The frame base register i.e. might change when dynamic stack
realignment is performed on X86.
If there is anything missing please let me know.
Thanks
Cheers,
Juergen
Header v2 {
  uint8  : Stack Map Version (2)
  uint8  : Reserved [3] (0)
  uint32 : Constants Offset (bytes)
  uint32 : Frame Records Offset (bytes)
  uint32 : Frame Registers Offset (bytes)
  uint32 : StackMap Records Offset (bytes)
  uint32 : Locations Offset (bytes)
  uint32 : LiveOuts Offset (bytes)
}
align to 8 bytes
Constants[] {
  uint64 : LargeConstant
}
align to 8 bytes
FrameRecord[] {
  uint64 : Function Address
  uint32 : Function Size
  uint32 : Stack Size
  uint16 : Flags {
    bool : HasFrame
    bool : HasVariableSizeAlloca
    bool : HasStackRealignment
    bool : HasLiveOutInfo
    bool : Reserved [12]
  }
  uint16 : Frame Base Register Dwarf RegNum
  uint16 : Num Frame Registers
  uint16 : Frame Register Index
  uint16 : Num StackMap Records
  uint16 : StackMap Record Index
}
align to 4 bytes
FrameRegister[] {
  uint16 : Dwarf RegNum
  int16  : Offset
  uint8  : Size in Bytes
  uint8  : Flags {
    bool : IsSpilled
    bool : Reserved [7]
  }
}
align to 8 bytes
StackMapRecord[] {
  uint64 : PatchPoint ID
  uint32 : Instruction Offset
  uint8  : Call size (bytes)
  uint8  : Flags {
    bool : HasLiveOutInfo
    bool : Reserved [7]
  }
  uint16 : Num Locations
  uint16 : Location Index
  uint16 : Num LiveOuts
  uint16 : LiveOut Index
}
align to 4 bytes
Location[] {
  uint8  : Register | Direct | Indirect | Constant | ConstantIndex
  uint8  : Reserved (location flags)
  uint16 : Dwarf RegNum
  int32  : Offset or SmallConstant
}
align to 2 bytes
LiveOuts[] {
  uint16 : Dwarf RegNum
  uint8  : Reserved
  uint8  : Size in Bytes
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/702de814/attachment.html>
Andrew Trick
2015-Jul-09  21:19 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
> On Jul 9, 2015, at 2:04 PM, Juergen Ributzka <juergen at apple.com> wrote: > > Hi @ll, > > over the past year we gained more experience with the patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the stackmap format. > The following proposal includes feedback and request from several interested parties and I would like to hear your feedback.Looks really nice. Do we care about field alignment though?> Missing correlation between functions and stackmap records: > Originally the client had to keep track of the ID to know which stackmap record belongs to which function, but this would stop working once functions are inlined. > The new format fixes that by adding a direct reference from the function to the stackmap records. > > Call Size and Function Size: > These are additional information that are of interest and have been added to the format. > @Swaroop: Could you please provide a little more detailed explanation on the "Call Size" field and what exactly is there recorded. Is it just the call instruction > or also the materialization code for the address? For what is this used for?I think the number of patchable bytes should be recorded (including the call itself), but I don’t know exactly how Swaroop is using it. Andy> Flat format: > We think moving to a flat form will make parsing easier, because every record has a fixed size and offsets can be calculated easily. The plan is to also > provide parsers for both stackmap versions (there is already one for the first format in tree) and a corresponding C-API to make it easier for clients to > adopt the new format. There is no plan to drop the original format and we will continue to support both formats. I will ask for feedback on the C API in a > separate RFC. > > Another benefit we hope to achieve from this format is to optimize for size by uniquing entries - but that is independent optimization and not required. > > More detailed frame record: > Clients require more information about the function frame, such as spilled registers, etc. The frame base register i.e. might change when dynamic stack > realignment is performed on X86. > > > If there is anything missing please let me know. > > Thanks > > Cheers, > Juergen > > > Header v2 { > uint8 : Stack Map Version (2) > uint8 : Reserved [3] (0) > uint32 : Constants Offset (bytes) > uint32 : Frame Records Offset (bytes) > uint32 : Frame Registers Offset (bytes) > uint32 : StackMap Records Offset (bytes) > uint32 : Locations Offset (bytes) > uint32 : LiveOuts Offset (bytes) > } > > align to 8 bytes > Constants[] { > uint64 : LargeConstant > } > > align to 8 bytes > FrameRecord[] { > uint64 : Function Address > uint32 : Function Size > uint32 : Stack Size > uint16 : Flags { > bool : HasFrame > bool : HasVariableSizeAlloca > bool : HasStackRealignment > bool : HasLiveOutInfo > bool : Reserved [12] > } > uint16 : Frame Base Register Dwarf RegNum > uint16 : Num Frame Registers > uint16 : Frame Register Index > uint16 : Num StackMap Records > uint16 : StackMap Record Index > } > > align to 4 bytes > FrameRegister[] { > uint16 : Dwarf RegNum > int16 : Offset > uint8 : Size in Bytes > uint8 : Flags { > bool : IsSpilled > bool : Reserved [7] > } > } > > align to 8 bytes > StackMapRecord[] { > uint64 : PatchPoint ID > uint32 : Instruction Offset > uint8 : Call size (bytes) > uint8 : Flags { > bool : HasLiveOutInfo > bool : Reserved [7] > } > uint16 : Num Locations > uint16 : Location Index > uint16 : Num LiveOuts > uint16 : LiveOut Index > } > > align to 4 bytes > Location[] { > uint8 : Register | Direct | Indirect | Constant | ConstantIndex > uint8 : Reserved (location flags) > uint16 : Dwarf RegNum > int32 : Offset or SmallConstant > } > > align to 2 bytes > LiveOuts[] { > uint16 : Dwarf RegNum > uint8 : Reserved > uint8 : Size in Bytes > } >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/4eb3aa8b/attachment.html>
Andrew Trick
2015-Jul-09  21:34 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
> On Jul 9, 2015, at 2:19 PM, Andrew Trick <atrick at apple.com> wrote: > > >> On Jul 9, 2015, at 2:04 PM, Juergen Ributzka <juergen at apple.com <mailto:juergen at apple.com>> wrote: >> >> Hi @ll, >> >> over the past year we gained more experience with the patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the stackmap format. >> The following proposal includes feedback and request from several interested parties and I would like to hear your feedback. > > Looks really nice. Do we care about field alignment though?Never mind. I just noticed that you took care of natural alignment by reserving byte arrays (Reserved [3]) and adding “align to” notes before each set of records. Andy>> Missing correlation between functions and stackmap records: >> Originally the client had to keep track of the ID to know which stackmap record belongs to which function, but this would stop working once functions are inlined. >> The new format fixes that by adding a direct reference from the function to the stackmap records. >> >> Call Size and Function Size: >> These are additional information that are of interest and have been added to the format. >> @Swaroop: Could you please provide a little more detailed explanation on the "Call Size" field and what exactly is there recorded. Is it just the call instruction >> or also the materialization code for the address? For what is this used for? > > I think the number of patchable bytes should be recorded (including the call itself), but I don’t know exactly how Swaroop is using it. > > Andy > >> Flat format: >> We think moving to a flat form will make parsing easier, because every record has a fixed size and offsets can be calculated easily. The plan is to also >> provide parsers for both stackmap versions (there is already one for the first format in tree) and a corresponding C-API to make it easier for clients to >> adopt the new format. There is no plan to drop the original format and we will continue to support both formats. I will ask for feedback on the C API in a >> separate RFC. >> >> Another benefit we hope to achieve from this format is to optimize for size by uniquing entries - but that is independent optimization and not required. >> >> More detailed frame record: >> Clients require more information about the function frame, such as spilled registers, etc. The frame base register i.e. might change when dynamic stack >> realignment is performed on X86. >> >> >> If there is anything missing please let me know. >> >> Thanks >> >> Cheers, >> Juergen >> >> >> Header v2 { >> uint8 : Stack Map Version (2) >> uint8 : Reserved [3] (0) >> uint32 : Constants Offset (bytes) >> uint32 : Frame Records Offset (bytes) >> uint32 : Frame Registers Offset (bytes) >> uint32 : StackMap Records Offset (bytes) >> uint32 : Locations Offset (bytes) >> uint32 : LiveOuts Offset (bytes) >> } >> >> align to 8 bytes >> Constants[] { >> uint64 : LargeConstant >> } >> >> align to 8 bytes >> FrameRecord[] { >> uint64 : Function Address >> uint32 : Function Size >> uint32 : Stack Size >> uint16 : Flags { >> bool : HasFrame >> bool : HasVariableSizeAlloca >> bool : HasStackRealignment >> bool : HasLiveOutInfo >> bool : Reserved [12] >> } >> uint16 : Frame Base Register Dwarf RegNum >> uint16 : Num Frame Registers >> uint16 : Frame Register Index >> uint16 : Num StackMap Records >> uint16 : StackMap Record Index >> } >> >> align to 4 bytes >> FrameRegister[] { >> uint16 : Dwarf RegNum >> int16 : Offset >> uint8 : Size in Bytes >> uint8 : Flags { >> bool : IsSpilled >> bool : Reserved [7] >> } >> } >> >> align to 8 bytes >> StackMapRecord[] { >> uint64 : PatchPoint ID >> uint32 : Instruction Offset >> uint8 : Call size (bytes) >> uint8 : Flags { >> bool : HasLiveOutInfo >> bool : Reserved [7] >> } >> uint16 : Num Locations >> uint16 : Location Index >> uint16 : Num LiveOuts >> uint16 : LiveOut Index >> } >> >> align to 4 bytes >> Location[] { >> uint8 : Register | Direct | Indirect | Constant | ConstantIndex >> uint8 : Reserved (location flags) >> uint16 : Dwarf RegNum >> int32 : Offset or SmallConstant >> } >> >> align to 2 bytes >> LiveOuts[] { >> uint16 : Dwarf RegNum >> uint8 : Reserved >> uint8 : Size in Bytes >> } >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/1fa29b71/attachment.html>
Swaroop Sridhar
2015-Jul-09  22:33 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
Regarding Call-site size specification:
CoreCLR (https://github.com/dotnet/coreclr) requires the size of the
Call-instruction to be reported in the GCInfo encoding.
The runtime performs querries for StackMap records using instruction offsets as
follows:
1)      Offset at the end of the call instruction (offset of next instruction-1)
if the call instruction occurs in code where GC can only take control at
safe-points.
2)      Offset of the start of the call instruction if the call instruction
occurs within a code-range that allows full interruption (that is, all
instructions in a range are considered safe-points)
LLVM/statepoint GC does not support option (2), but the CoreCLR's GC-table
Encoding has an interface designed to suite both modes.
void DefineCallSites(UINT32* pCallSites, BYTE* pCallSiteSizes, UINT32
numCallSites)
Therefore, it is helpful to have Call-Site size specified in StackMapRecord.
I agree with Andy, that the call-site size should include all bytes between the
start of the call instruction and the start of the next instruction.
Thanks,
Swaroop.
From: Juergen Ributzka [mailto:juergen at apple.com]
Sent: Thursday, July 9, 2015 2:04 PM
To: LLVM Dev
Cc: Lang Hames; Andrew Trick; Phil Pizlo; Philip Reames; Sanjoy Das; Swaroop
Sridhar; Russell Hadley
Subject: [RFC] New StackMap format proposal (StackMap v2)
Hi @ll,
over the past year we gained more experience with the
patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the
stackmap format.
The following proposal includes feedback and request from several interested
parties and I would like to hear your feedback.
Missing correlation between functions and stackmap records:
Originally the client had to keep track of the ID to know which stackmap record
belongs to which function, but this would stop working once functions are
inlined.
The new format fixes that by adding a direct reference from the function to the
stackmap records.
Call Size and Function Size:
These are additional information that are of interest and have been added to the
format.
@Swaroop: Could you please provide a little more detailed explanation on the
"Call Size" field and what exactly is there recorded. Is it just the
call instruction
or also the materialization code for the address? For what is this used for?
Flat format:
We think moving to a flat form will make parsing easier, because every record
has a fixed size and offsets can be calculated easily. The plan is to also
provide parsers for both stackmap versions (there is already one for the first
format in tree) and a corresponding C-API to make it easier for clients to
adopt the new format. There is no plan to drop the original format and we will
continue to support both formats. I will ask for feedback on the C API in a
separate RFC.
Another benefit we hope to achieve from this format is to optimize for size by
uniquing entries - but that is independent optimization and not required.
More detailed frame record:
Clients require more information about the function frame, such as spilled
registers, etc. The frame base register i.e. might change when dynamic stack
realignment is performed on X86.
If there is anything missing please let me know.
Thanks
Cheers,
Juergen
Header v2 {
  uint8  : Stack Map Version (2)
  uint8  : Reserved [3] (0)
  uint32 : Constants Offset (bytes)
  uint32 : Frame Records Offset (bytes)
  uint32 : Frame Registers Offset (bytes)
  uint32 : StackMap Records Offset (bytes)
  uint32 : Locations Offset (bytes)
  uint32 : LiveOuts Offset (bytes)
}
align to 8 bytes
Constants[] {
  uint64 : LargeConstant
}
align to 8 bytes
FrameRecord[] {
  uint64 : Function Address
  uint32 : Function Size
  uint32 : Stack Size
  uint16 : Flags {
    bool : HasFrame
    bool : HasVariableSizeAlloca
    bool : HasStackRealignment
    bool : HasLiveOutInfo
    bool : Reserved [12]
  }
  uint16 : Frame Base Register Dwarf RegNum
  uint16 : Num Frame Registers
  uint16 : Frame Register Index
  uint16 : Num StackMap Records
  uint16 : StackMap Record Index
}
align to 4 bytes
FrameRegister[] {
  uint16 : Dwarf RegNum
  int16  : Offset
  uint8  : Size in Bytes
  uint8  : Flags {
    bool : IsSpilled
    bool : Reserved [7]
  }
}
align to 8 bytes
StackMapRecord[] {
  uint64 : PatchPoint ID
  uint32 : Instruction Offset
  uint8  : Call size (bytes)
  uint8  : Flags {
    bool : HasLiveOutInfo
    bool : Reserved [7]
  }
  uint16 : Num Locations
  uint16 : Location Index
  uint16 : Num LiveOuts
  uint16 : LiveOut Index
}
align to 4 bytes
Location[] {
  uint8  : Register | Direct | Indirect | Constant | ConstantIndex
  uint8  : Reserved (location flags)
  uint16 : Dwarf RegNum
  int32  : Offset or SmallConstant
}
align to 2 bytes
LiveOuts[] {
  uint16 : Dwarf RegNum
  uint8  : Reserved
  uint8  : Size in Bytes
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/a29d6dbe/attachment.html>
Andrew Trick
2015-Jul-09  23:36 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
> On Jul 9, 2015, at 3:33 PM, Swaroop Sridhar <Swaroop.Sridhar at microsoft.com> wrote: > > Regarding Call-site size specification: > > CoreCLR (https://github.com/dotnet/coreclr <https://github.com/dotnet/coreclr>) requires the size of the Call-instruction to be reported in the GCInfo encoding. > > The runtime performs querries for StackMap records using instruction offsets as follows: > 1) Offset at the end of the call instruction (offset of next instruction-1) if the call instruction occurs in code where GC can only take control at safe-points.As part of this change it would be great if LLVM could now guarantee that the call will be emitted at the end of the patchable space. It currently happens to emit at the beginning, but makes no guarantee. Emitting at the end works better for tracking the return address.> 2) Offset of the start of the call instruction if the call instruction occurs within a code-range that allows full interruption (that is, all instructions in a range are considered safe-points)If the JIT requires knowledge of call's encoding, it should probably be emitting the call instruction itself within the patchable space reserved by LLVM. Note that the patchpoint may include a mov in addition to call, so the patchpoint address is not the same as the call address.> > LLVM/statepoint GC does not support option (2), but the CoreCLR’s GC-table Encoding has an interface designed to suite both modes. > void DefineCallSites(UINT32* pCallSites, BYTE* pCallSiteSizes, UINT32 numCallSites) > > Therefore, it is helpful to have Call-Site size specified in StackMapRecord. > I agree with Andy, that the call-site size should include all bytes between the start of the call instruction and the start of the next instruction.I suggested this because we want to support a dynamic callback that determines the patchpoint size (given the set of register arguments), and it would be nice to record that decision within the stack map. This is important information for any code that is responsible for patching because it must patch over all reserved bytes. The alternative would be for LLVM to emit the call at the end and record just the size of the call instruction encoding. That seems like a silly, x86-specific waste of stackmap space though. The JIT can either do the encoding and keep track of the info, or a small nop+move_immediate decoder can figure it out for all reasonable cases. Andy> > Thanks, > Swaroop. > > From: Juergen Ributzka [mailto:juergen at apple.com] > Sent: Thursday, July 9, 2015 2:04 PM > To: LLVM Dev > Cc: Lang Hames; Andrew Trick; Phil Pizlo; Philip Reames; Sanjoy Das; Swaroop Sridhar; Russell Hadley > Subject: [RFC] New StackMap format proposal (StackMap v2) > > Hi @ll, > > over the past year we gained more experience with the patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the stackmap format. > The following proposal includes feedback and request from several interested parties and I would like to hear your feedback. > > Missing correlation between functions and stackmap records: > Originally the client had to keep track of the ID to know which stackmap record belongs to which function, but this would stop working once functions are inlined. > The new format fixes that by adding a direct reference from the function to the stackmap records. > > Call Size and Function Size: > These are additional information that are of interest and have been added to the format. > @Swaroop: Could you please provide a little more detailed explanation on the "Call Size" field and what exactly is there recorded. Is it just the call instruction > or also the materialization code for the address? For what is this used for? > > Flat format: > We think moving to a flat form will make parsing easier, because every record has a fixed size and offsets can be calculated easily. The plan is to also > provide parsers for both stackmap versions (there is already one for the first format in tree) and a corresponding C-API to make it easier for clients to > adopt the new format. There is no plan to drop the original format and we will continue to support both formats. I will ask for feedback on the C API in a > separate RFC. > > Another benefit we hope to achieve from this format is to optimize for size by uniquing entries - but that is independent optimization and not required. > > More detailed frame record: > Clients require more information about the function frame, such as spilled registers, etc. The frame base register i.e. might change when dynamic stack > realignment is performed on X86. > > > If there is anything missing please let me know. > > Thanks > > Cheers, > Juergen > > > Header v2 { > uint8 : Stack Map Version (2) > uint8 : Reserved [3] (0) > uint32 : Constants Offset (bytes) > uint32 : Frame Records Offset (bytes) > uint32 : Frame Registers Offset (bytes) > uint32 : StackMap Records Offset (bytes) > uint32 : Locations Offset (bytes) > uint32 : LiveOuts Offset (bytes) > } > > align to 8 bytes > Constants[] { > uint64 : LargeConstant > } > > align to 8 bytes > FrameRecord[] { > uint64 : Function Address > uint32 : Function Size > uint32 : Stack Size > uint16 : Flags { > bool : HasFrame > bool : HasVariableSizeAlloca > bool : HasStackRealignment > bool : HasLiveOutInfo > bool : Reserved [12] > } > uint16 : Frame Base Register Dwarf RegNum > uint16 : Num Frame Registers > uint16 : Frame Register Index > uint16 : Num StackMap Records > uint16 : StackMap Record Index > } > > align to 4 bytes > FrameRegister[] { > uint16 : Dwarf RegNum > int16 : Offset > uint8 : Size in Bytes > uint8 : Flags { > bool : IsSpilled > bool : Reserved [7] > } > } > > align to 8 bytes > StackMapRecord[] { > uint64 : PatchPoint ID > uint32 : Instruction Offset > uint8 : Call size (bytes) > uint8 : Flags { > bool : HasLiveOutInfo > bool : Reserved [7] > } > uint16 : Num Locations > uint16 : Location Index > uint16 : Num LiveOuts > uint16 : LiveOut Index > } > > align to 4 bytes > Location[] { > uint8 : Register | Direct | Indirect | Constant | ConstantIndex > uint8 : Reserved (location flags) > uint16 : Dwarf RegNum > int32 : Offset or SmallConstant > } > > align to 2 bytes > LiveOuts[] { > uint16 : Dwarf RegNum > uint8 : Reserved > uint8 : Size in Bytes > } >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150709/702ea78e/attachment.html>
Hal Finkel
2015-Jul-09  23:59 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
----- Original Message -----> From: "Juergen Ributzka" <juergen at apple.com> > To: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Cc: "Lang Hames" <lhames at apple.com> > Sent: Thursday, July 9, 2015 4:04:20 PM > Subject: [LLVMdev] [RFC] New StackMap format proposal (StackMap v2) > > > Hi @ll, > > > over the past year we gained more experience with the > patchpoint/stackmap/statepoint intrinsics and it exposed limitations > in the stackmap format. > The following proposal includes feedback and request from several > interested parties and I would like to hear your feedback. > > > Missing correlation between functions and stackmap records: > Originally the client had to keep track of the ID to know which > stackmap record belongs to which function, but this would stop > working once functions are inlined. > The new format fixes that by adding a direct reference from the > function to the stackmap records. > > > Call Size and Function Size: > These are additional information that are of interest and have been > added to the format. > @Swaroop: Could you please provide a little more detailed explanation > on the "Call Size" field and what exactly is there recorded. Is it > just the call instruction > or also the materialization code for the address? For what is this > used for? > > > Flat format: > We think moving to a flat form will make parsing easier, because > every record has a fixed size and offsets can be calculated easily. > The plan is to also > provide parsers for both stackmap versions (there is already one for > the first format in tree) and a corresponding C-API to make it > easier for clients to > adopt the new format. There is no plan to drop the original format > and we will continue to support both formats. I will ask for > feedback on the C API in a > separate RFC. > > > Another benefit we hope to achieve from this format is to optimize > for size by uniquing entries - but that is independent optimization > and not required. > > > More detailed frame record: > Clients require more information about the function frame, such as > spilled registers, etc. The frame base register i.e. might change > when dynamic stack > realignment is performed on X86. > > > > > If there is anything missing please let me know. > > > Thanks > > > Cheers, > Juergen > > > > > > Header v2 { > uint8 : Stack Map Version (2) > uint8 : Reserved [3] (0) > uint32 : Constants Offset (bytes) > uint32 : Frame Records Offset (bytes) > uint32 : Frame Registers Offset (bytes) > uint32 : StackMap Records Offset (bytes) > uint32 : Locations Offset (bytes) > uint32 : LiveOuts Offset (bytes) > } > > align to 8 bytes > Constants[] { > uint64 : LargeConstant > } > > align to 8 bytes > FrameRecord[] { > uint64 : Function Address > uint32 : Function Size > uint32 : Stack Size > uint16 : Flags { > bool : HasFrame > bool : HasVariableSizeAlloca > bool : HasStackRealignment > bool : HasLiveOutInfo > bool : Reserved [12] > } > uint16 : Frame Base Register Dwarf RegNum > uint16 : Num Frame Registers > uint16 : Frame Register Index > uint16 : Num StackMap Records > uint16 : StackMap Record Index > } > > align to 4 bytes > FrameRegister[] { > uint16 : Dwarf RegNum > int16 : Offset > uint8 : Size in Bytes > uint8 : Flags { > bool : IsSpilled > bool : Reserved [7] > } > } > > align to 8 bytes > StackMapRecord[] { > uint64 : PatchPoint ID > uint32 : Instruction Offset > uint8 : Call size (bytes) > uint8 : Flags { > bool : HasLiveOutInfo > bool : Reserved [7] > } > uint16 : Num Locations > uint16 : Location Index > uint16 : Num LiveOuts > uint16 : LiveOut Index > }Do you guarantee that these will appear in order of increasing instruction offset? -Hal> > > align to 4 bytes > Location[] { > uint8 : Register | Direct | Indirect | Constant | ConstantIndex > uint8 : Reserved (location flags) > uint16 : Dwarf RegNum > int32 : Offset or SmallConstant > } > > align to 2 bytes > LiveOuts[] { > uint16 : Dwarf RegNum > uint8 : Reserved > uint8 : Size in Bytes > } > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Juergen Ributzka
2015-Jul-10  16:34 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
Hi Hal, no, as far as I can recall we don’t make that guarantee for StackMap Records. Although, since we record the StackMap Records in function order and in instruction order inside a function this has always been true, but that wasn’t intentional. This format shouldn’t change this, but it also isn’t something that we programmatically enforce. Do you depend on that behavior? -Juergen> On Jul 9, 2015, at 4:59 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > Do you guarantee that these will appear in order of increasing instruction offset? > > -Hal-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150710/f0578e24/attachment.html>
Philip Reames
2015-Jul-15  06:02 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
Comments inline. I tried to reply to particular points of discussion downthread, so this is probably best read last. On 07/09/2015 02:04 PM, Juergen Ributzka wrote:> Hi @ll, > > over the past year we gained more experience with the > patchpoint/stackmap/statepoint intrinsics and it exposed limitations > in the stackmap format. > The following proposal includes feedback and request from several > interested parties and I would like to hear your feedback. > > Missing correlation between functions and stackmap records: > Originally the client had to keep track of the ID to know which > stackmap record belongs to which function, but this would stop working > once functions are inlined. > The new format fixes that by adding a direct reference from the > function to the stackmap records.+1> > Call Size and Function Size: > These are additional information that are of interest and have been > added to the format. > @Swaroop: Could you please provide a little more detailed explanation > on the "Call Size" field and what exactly is there recorded. Is it > just the call instruction > or also the materialization code for the address? For what is this > used for?+1 subject to details being settled> > Flat format: > We think moving to a flat form will make parsing easier, because every > record has a fixed size and offsets can be calculated easily.I'm still not convinced this is actually a good idea, but I have no strong objection either.> The plan is to also > provide parsers for both stackmap versions (there is already one for > the first format in tree) and a corresponding C-API to make it easier > for clients to > adopt the new format. There is no plan to drop the original format and > we will continue to support both formats. I will ask for feedback on > the C API in a > separate RFC.I strongly feel that trying to support multiple versions of the file format in tree is a bad idea. It's a distraction, support burden, and provides extremely little value. What's the value in supporting a parser for a format that ToT will no longer be able to emit? Anyone who is actually consuming the old format would have to be using the old version in process anyways. (Possible exception: Is this intended to simplify the object caching model? If so, have we ever given any guarantees about object caching in the JIT across versions? That seems risky at best..) I'm also opposed to the complexity of C API, but I'll defer that until the separate RFC is actually posted.> > Another benefit we hope to achieve from this format is to optimize for > size by uniquing entries - but that is independent optimization and > not required. > > More detailed frame record: > Clients require more information about the function frame, such as > spilled registers, etc. The frame base register i.e. might change when > dynamic stack > realignment is performed on X86. > > > If there is anything missing please let me know. > > Thanks > > Cheers, > Juergen > >Is the format Little endian, bit endian? native endian?> Header v2 { > uint8 : Stack Map Version (2) > uint8 : Reserved [3] (0) > uint32 : Constants Offset (bytes) > uint32 : Frame Records Offset (bytes) > uint32 : Frame Registers Offset (bytes) > uint32 : StackMap Records Offset (bytes) > uint32 : Locations Offset (bytes) > uint32 : LiveOuts Offset (bytes)Let's reserve at least two uint32s for future field groups so that we can extend in a backwards compatible way. Another option would be to steal one of the reserved bytes and require parsers to ignore sections they don't understand. Actually, so long as we document that there might be unknown bytes between Header.end and Constants.begin we should be fine. It just needs to be *clearly* documented. As an example, if we need to add the SymbolConstants section mentioned down thread, that shouldn't require an version change.> } > > align to 8 bytes > Constants[] { > uint64 : LargeConstant > } > > align to 8 bytes > FrameRecord[] { > uint64 : Function Address > uint32 : Function Size > uint32 : Stack Size > uint16 : Flags { > bool : HasFrameWhat does this mean?> bool : HasVariableSizeAlloca > bool : HasStackRealignment > bool : HasLiveOutInfo > bool : Reserved [12]> } > uint16 : Frame Base Register Dwarf RegNumI don't think this is actually sufficient for dynamic stack realignment. Don't you end up with some things indexed off RSP and others indexed off RBP? Actually, wait.. that's entirely handled within the record format. What does tracking the Frame Base Register give us? I think I'm missing a use case here.> uint16 : Num Frame Registers > uint16 : Frame Register IndexAre these (begin, begin+length) offsets into the Frame Registers table? If so, should we swap them?> uint16 : Num StackMap Records > uint16 : StackMap Record Index > } > > align to 4 bytes > FrameRegister[] { > uint16 : Dwarf RegNum > int16 : Offset > uint8 : Size in Bytes > uint8 : Flags { > bool : IsSpilled > bool : Reserved [7] > } > }This seems tied specifically to the assumption that registers are spilled on entry. Given the recent work towards shrink wrapping, is that something we want to do? p.s. What's with the IsSpilled bit? Why would you have a record for an unspilled register?> > align to 8 bytes > StackMapRecord[] { > uint64 : PatchPoint ID"ID" (remove the patchpoint since statepoint now has one too)> uint32 : Instruction Offset > uint8 : Call size (bytes)"Patchable size" per down thread discussion> uint8 : Flags { > bool : HasLiveOutInfo > bool : Reserved [7] > }Why do we need these flags? Isn't this Num LiveOuts == 0?> uint16 : Num Locations > uint16 : Location Index > uint16 : Num LiveOuts > uint16 : LiveOut Index > } > > align to 4 bytes > Location[] { > uint8 : Register | Direct | Indirect | Constant | ConstantIndex > uint8 : Reserved (location flags) > uint16 : Dwarf RegNum > int32 : Offset or SmallConstant > } > > align to 2 bytes > LiveOuts[] { > uint16 : Dwarf RegNum > uint8 : Reserved > uint8 : Size in Bytes > } >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/f7b74ebc/attachment.html>
Hal Finkel
2015-Jul-15  06:15 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
----- Original Message -----> From: "Philip Reames" <listmail at philipreames.com> > To: "Juergen Ributzka" <juergen at apple.com>, "LLVM Dev" <llvmdev at cs.uiuc.edu> > Cc: "Lang Hames" <lhames at apple.com> > Sent: Wednesday, July 15, 2015 1:02:21 AM > Subject: Re: [LLVMdev] [RFC] New StackMap format proposal (StackMap v2) > > > > Comments inline. I tried to reply to particular points of discussion > downthread, so this is probably best read last. > > On 07/09/2015 02:04 PM, Juergen Ributzka wrote: > > > Hi @ll, > > > over the past year we gained more experience with the > patchpoint/stackmap/statepoint intrinsics and it exposed limitations > in the stackmap format. > The following proposal includes feedback and request from several > interested parties and I would like to hear your feedback. > > > Missing correlation between functions and stackmap records: > Originally the client had to keep track of the ID to know which > stackmap record belongs to which function, but this would stop > working once functions are inlined. > The new format fixes that by adding a direct reference from the > function to the stackmap records. +1 > > > > > > Call Size and Function Size: > These are additional information that are of interest and have been > added to the format. > @Swaroop: Could you please provide a little more detailed explanation > on the "Call Size" field and what exactly is there recorded. Is it > just the call instruction > or also the materialization code for the address? For what is this > used for? +1 subject to details being settled > > > > > > Flat format: > We think moving to a flat form will make parsing easier, because > every record has a fixed size and offsets can be calculated easily. > I'm still not convinced this is actually a good idea, but I have no > strong objection either. > > > The plan is to also > provide parsers for both stackmap versions (there is already one for > the first format in tree) and a corresponding C-API to make it > easier for clients to > adopt the new format. There is no plan to drop the original format > and we will continue to support both formats. I will ask for > feedback on the C API in a > separate RFC. > I strongly feel that trying to support multiple versions of the file > format in tree is a bad idea. It's a distraction, support burden, > and provides extremely little value. What's the value in supporting > a parser for a format that ToT will no longer be able to emit? > Anyone who is actually consuming the old format would have to be > using the old version in process anyways.I agree. There seems to be no reason to support the old format. We did label these 'experimental' after all. -Hal> > (Possible exception: Is this intended to simplify the object caching > model? If so, have we ever given any guarantees about object caching > in the JIT across versions? That seems risky at best..) > > I'm also opposed to the complexity of C API, but I'll defer that > until the separate RFC is actually posted. > > > > > > Another benefit we hope to achieve from this format is to optimize > for size by uniquing entries - but that is independent optimization > and not required. > > > More detailed frame record: > Clients require more information about the function frame, such as > spilled registers, etc. The frame base register i.e. might change > when dynamic stack > realignment is performed on X86. > > > > > If there is anything missing please let me know. > > > Thanks > > > Cheers, > Juergen > > > > Is the format Little endian, bit endian? native endian? > > > > > Header v2 { > uint8 : Stack Map Version (2) > uint8 : Reserved [3] (0) > uint32 : Constants Offset (bytes) > uint32 : Frame Records Offset (bytes) > uint32 : Frame Registers Offset (bytes) > uint32 : StackMap Records Offset (bytes) > uint32 : Locations Offset (bytes) > uint32 : LiveOuts Offset (bytes) Let's reserve at least two uint32s > for future field groups so that we can extend in a backwards > compatible way. Another option would be to steal one of the reserved > bytes and require parsers to ignore sections they don't understand. > > Actually, so long as we document that there might be unknown bytes > between Header.end and Constants.begin we should be fine. It just > needs to be *clearly* documented. > > As an example, if we need to add the SymbolConstants section > mentioned down thread, that shouldn't require an version change. > > > > > } > > align to 8 bytes > Constants[] { > uint64 : LargeConstant > } > > align to 8 bytes > FrameRecord[] { > uint64 : Function Address > uint32 : Function Size > uint32 : Stack Size > uint16 : Flags { > bool : HasFrame > What does this mean? > > > > > bool : HasVariableSizeAlloca > bool : HasStackRealignment > bool : HasLiveOutInfo > bool : Reserved [12] > > > > > > } > uint16 : Frame Base Register Dwarf RegNum > I don't think this is actually sufficient for dynamic stack > realignment. Don't you end up with some things indexed off RSP and > others indexed off RBP? Actually, wait.. that's entirely handled > within the record format. What does tracking the Frame Base Register > give us? I think I'm missing a use case here. > > > > > uint16 : Num Frame Registers > uint16 : Frame Register Index > Are these (begin, begin+length) offsets into the Frame Registers > table? If so, should we swap them? > > > > > uint16 : Num StackMap Records > uint16 : StackMap Record Index > } > > align to 4 bytes > FrameRegister[] { > uint16 : Dwarf RegNum > int16 : Offset > uint8 : Size in Bytes > uint8 : Flags { > bool : IsSpilled > bool : Reserved [7] > } > } > This seems tied specifically to the assumption that registers are > spilled on entry. Given the recent work towards shrink wrapping, is > that something we want to do? > > p.s. What's with the IsSpilled bit? Why would you have a record for > an unspilled register? > > > > > > align to 8 bytes > StackMapRecord[] { > uint64 : PatchPoint ID > "ID" (remove the patchpoint since statepoint now has one too) > > > > > uint32 : Instruction Offset > uint8 : Call size (bytes) > "Patchable size" per down thread discussion > > > > > uint8 : Flags { > bool : HasLiveOutInfo > bool : Reserved [7] > } > Why do we need these flags? Isn't this Num LiveOuts == 0? > > > > > uint16 : Num Locations > uint16 : Location Index > uint16 : Num LiveOuts > uint16 : LiveOut Index > } > > > align to 4 bytes > Location[] { > uint8 : Register | Direct | Indirect | Constant | ConstantIndex > uint8 : Reserved (location flags) > uint16 : Dwarf RegNum > int32 : Offset or SmallConstant > } > > align to 2 bytes > LiveOuts[] { > uint16 : Dwarf RegNum > uint8 : Reserved > uint8 : Size in Bytes > } > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Juergen Ributzka
2015-Jul-17  20:16 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
> On Jul 14, 2015, at 11:02 PM, Philip Reames <listmail at philipreames.com> wrote: > > Comments inline. I tried to reply to particular points of discussion downthread, so this is probably best read last. > > On 07/09/2015 02:04 PM, Juergen Ributzka wrote: >> Hi @ll, >> >> over the past year we gained more experience with the patchpoint/stackmap/statepoint intrinsics and it exposed limitations in the stackmap format. >> The following proposal includes feedback and request from several interested parties and I would like to hear your feedback. >> >> Missing correlation between functions and stackmap records: >> Originally the client had to keep track of the ID to know which stackmap record belongs to which function, but this would stop working once functions are inlined. >> The new format fixes that by adding a direct reference from the function to the stackmap records. > +1 >> >> Call Size and Function Size: >> These are additional information that are of interest and have been added to the format. >> @Swaroop: Could you please provide a little more detailed explanation on the "Call Size" field and what exactly is there recorded. Is it just the call instruction >> or also the materialization code for the address? For what is this used for? > +1 subject to details being settled >> >> Flat format: >> We think moving to a flat form will make parsing easier, because every record has a fixed size and offsets can be calculated easily. > I'm still not convinced this is actually a good idea, but I have no strong objection either. >> >> The plan is to also >> provide parsers for both stackmap versions (there is already one for the first format in tree) and a corresponding C-API to make it easier for clients to >> adopt the new format. There is no plan to drop the original format and we will continue to support both formats. I will ask for feedback on the C API in a >> separate RFC. > I strongly feel that trying to support multiple versions of the file format in tree is a bad idea. It's a distraction, support burden, and provides extremely little value. What's the value in supporting a parser for a format that ToT will no longer be able to emit? Anyone who is actually consuming the old format would have to be using the old version in process anyways.I don’t think supporting multiple version will be that hard and I don’t plan to keep all current and future version in tree forever. But we should provide a reasonable long transition period that allows clients to transition from one version to another - basically one release cycle.> > > (Possible exception: Is this intended to simplify the object caching model? If so, have we ever given any guarantees about object caching in the JIT across versions? That seems risky at best..) > > I'm also opposed to the complexity of C API, but I'll defer that until the separate RFC is actually posted. >> >> Another benefit we hope to achieve from this format is to optimize for size by uniquing entries - but that is independent optimization and not required. >> >> More detailed frame record: >> Clients require more information about the function frame, such as spilled registers, etc. The frame base register i.e. might change when dynamic stack >> realignment is performed on X86. >> >> >> If there is anything missing please let me know. >> >> Thanks >> >> Cheers, >> Juergen >> >> > Is the format Little endian, bit endian? native endian?I would say whatever the object file format is.>> Header v2 { >> uint8 : Stack Map Version (2) >> uint8 : Reserved [3] (0) >> uint32 : Constants Offset (bytes) >> uint32 : Frame Records Offset (bytes) >> uint32 : Frame Registers Offset (bytes) >> uint32 : StackMap Records Offset (bytes) >> uint32 : Locations Offset (bytes) >> uint32 : LiveOuts Offset (bytes) > Let's reserve at least two uint32s for future field groups so that we can extend in a backwards compatible way. Another option would be to steal one of the reserved bytes and require parsers to ignore sections they don't understand. > > Actually, so long as we document that there might be unknown bytes between Header.end and Constants.begin we should be fine. It just needs to be *clearly* documented.That was the idea. That way we could add new sections and old parsers would still work. They would just ignore those sections. Although with the addition of official parser in trees I am not sure how important this is anymore.> > As an example, if we need to add the SymbolConstants section mentioned down thread, that shouldn't require an version change.Once we release a LLVM version any change should require a version change.> >> } >> >> align to 8 bytes >> Constants[] { >> uint64 : LargeConstant >> } >> >> align to 8 bytes >> FrameRecord[] { >> uint64 : Function Address >> uint32 : Function Size >> uint32 : Stack Size >> uint16 : Flags { >> bool : HasFrame > What does this mean?If the function has a frame pointer.> >> bool : HasVariableSizeAlloca >> bool : HasStackRealignment >> bool : HasLiveOutInfo >> bool : Reserved [12] > >> } >> uint16 : Frame Base Register Dwarf RegNum > I don't think this is actually sufficient for dynamic stack realignment. Don't you end up with some things indexed off RSP and others indexed off RBP? Actually, wait.. that's entirely handled within the record format. What does tracking the Frame Base Register give us? I think I'm missing a use case here. >> uint16 : Num Frame Registers >> uint16 : Frame Register Index > Are these (begin, begin+length) offsets into the Frame Registers table? If so, should we swap them?Sure>> uint16 : Num StackMap Records >> uint16 : StackMap Record Index >> } >> >> align to 4 bytes >> FrameRegister[] { >> uint16 : Dwarf RegNum >> int16 : Offset >> uint8 : Size in Bytes >> uint8 : Flags { >> bool : IsSpilled >> bool : Reserved [7] >> } >> } > This seems tied specifically to the assumption that registers are spilled on entry. Given the recent work towards shrink wrapping, is that something we want to do? > > p.s. What's with the IsSpilled bit? Why would you have a record for an unspilled register?This was done before shrink wrapping and pristine register support. The original problem was that the live-out set didn’t contain pristine registers and we have to add all callee-saved register to the live-out set too to make sure we didn’t miss anything. We could do better by recording which callee-save registers actually got spilled and which not.>> >> align to 8 bytes >> StackMapRecord[] { >> uint64 : PatchPoint ID > "ID" (remove the patchpoint since statepoint now has one too)sure>> uint32 : Instruction Offset >> uint8 : Call size (bytes) > "Patchable size" per down thread discussion >> uint8 : Flags { >> bool : HasLiveOutInfo >> bool : Reserved [7] >> } > Why do we need these flags? Isn't this Num LiveOuts == 0?A verification that the information was actually computed - we might have 0 live-outs too. We could also change this in the future to attach this information only to call-sites that have an attribute that request this information.>> uint16 : Num Locations >> uint16 : Location Index >> uint16 : Num LiveOuts >> uint16 : LiveOut Index >> } >> >> align to 4 bytes >> Location[] { >> uint8 : Register | Direct | Indirect | Constant | ConstantIndex >> uint8 : Reserved (location flags) >> uint16 : Dwarf RegNum >> int32 : Offset or SmallConstant >> } >> >> align to 2 bytes >> LiveOuts[] { >> uint16 : Dwarf RegNum >> uint8 : Reserved >> uint8 : Size in Bytes >> } >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150717/6a9608d7/attachment.html>