thr3ads.net - llvm dev - [LLVMdev] [RFC] Add Intel TSX HLE Support [Feb 2013]

Michael Liao

2013-Feb-19 22:07 UTC

[LLVMdev] [RFC] Add Intel TSX HLE Support

Hi All,

I'd like to add HLE support to LLVM/clang, consistent with GCC's style [1].
HLE, from Intel TSX [2], is a legacy-compatible instruction set extension
that marks a transactional region by adding XACQUIRE and XRELEASE prefixes.
To support it, GCC extends the memory order flag in the __atomic_* builtins
with a target-specific memory model in the high bits (bits 31-16 for the
target-specific memory model, bits 15-0 for the general memory model).
Following a similar approach, I propose to change LLVM/clang by adding:

+ 'targetflags' metadata in LLVM atomic IR to pass this
  target-specific memory model hint

+ one extra target flag in AtomicSDNode & MemIntrinsicSDNode to specify
  the XACQUIRE or XRELEASE hint
  This extra target flag is embedded into the SubclassData field. The
  following is the rationale for how such target flags are embedded into
  SubclassData in SDNode:

  Here is the current SDNode class hierarchy of memory-related nodes:

  SDNode -> MemSDNode -> LSBaseNode -> LoadSDNode
                    |             + -> StoreSDNode
                    + -> AtomicSDNode
                    + -> MemIntrinsicSDNode

  Here are the current SubclassData definitions:

  bit 0~1 : extension type used in LoadSDNode
  bit 0   : truncating store in StoreSDNode
  bit 2~4 : addressing mode in LSBaseNode
  bit 5   : volatile bit in MemSDNode
  bit 6   : non-temporal bit in MemSDNode
  bit 7   : invariant bit in MemSDNode
  bit 8~11: memory order in AtomicSDNode
  bit 12  : synch scope in AtomicSDNode

  Considering the class hierarchy, we can safely reuse bits 0~1 as the
  target flags in AtomicSDNode/MemIntrinsicSDNode, because the existing
  uses of those bits belong only to LoadSDNode and StoreSDNode.
  
+ a modified X86 backend that generates the additional XACQUIRE/XRELEASE
  prefix based on the specified target flag
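
The bit layout above can be sketched as a small packing helper. This is only an illustration of the proposed reuse of bits 0~1, not code from the patch: the function and constant names are hypothetical, and a 16-bit SubclassData field is assumed.

```cpp
#include <cstdint>

// Bit positions follow the table above; all names are hypothetical.
constexpr unsigned kTargetFlagsShift = 0;  // bits 0~1 (proposed reuse)
constexpr unsigned kVolatileShift = 5;     // bit 5
constexpr unsigned kOrderingShift = 8;     // bits 8~11
constexpr unsigned kSynchScopeShift = 12;  // bit 12

// Pack the fields an atomic node would carry into one 16-bit word
// (assumption: SubclassData is 16 bits wide).
constexpr std::uint16_t packAtomicSubclassData(unsigned targetFlags,
                                               bool isVolatile,
                                               unsigned ordering,
                                               unsigned synchScope) {
  return static_cast<std::uint16_t>(
      ((targetFlags & 0x3u) << kTargetFlagsShift) |
      ((isVolatile ? 1u : 0u) << kVolatileShift) |
      ((ordering & 0xFu) << kOrderingShift) |
      ((synchScope & 0x1u) << kSynchScopeShift));
}

// Read the two target-flag bits back out.
constexpr unsigned targetFlagsOf(std::uint16_t data) {
  return (data >> kTargetFlagsShift) & 0x3u;
}

// Read the four memory-order bits back out.
constexpr unsigned orderingOf(std::uint16_t data) {
  return (data >> kOrderingShift) & 0xFu;
}
```

Because the atomic fields start at bit 5, writing the target flags into bits 0~1 does not disturb the volatile, memory-order, or synch-scope bits.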


The following are details of each patch:

* 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch

This patch adds 'targetflags' support to AtomicSDNode and
MemIntrinsicSDNode. It checks the 'targetflags' metadata and embeds
its value into SubclassData. Currently, only two bits are defined.

* 0002-Add-HLE-target-feature.patch

This patch adds the HLE feature and auto-detection support.

* 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch

This patch adds XACQUIRE/XRELEASE prefix and its assembler/encoding
support
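
For reference, XACQUIRE and XRELEASE reuse the existing 0xF2 and 0xF3 prefix bytes, which is what makes HLE legacy compatible. A minimal sketch of prepending the prefix to an already-encoded instruction follows; the enum and function names are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>
#include <vector>

// HLE prefix bytes: XACQUIRE shares 0xF2 and XRELEASE shares 0xF3.
enum class HlePrefix : std::uint8_t {
  None = 0x00,
  XAcquire = 0xF2,
  XRelease = 0xF3
};

// Return a copy of the encoded instruction with the HLE prefix byte in front.
// With HlePrefix::None the bytes are returned unchanged.
std::vector<std::uint8_t> withHlePrefix(HlePrefix prefix,
                                        const std::vector<std::uint8_t>& insn) {
  std::vector<std::uint8_t> out;
  if (prefix != HlePrefix::None)
    out.push_back(static_cast<std::uint8_t>(prefix));
  out.insert(out.end(), insn.begin(), insn.end());
  return out;
}
```

For example, passing the bytes 0x87 0x07 (xchg %eax, (%rdi)) with HlePrefix::XAcquire yields 0xF2 0x87 0x07.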

* 0004-Enable-HLE-code-generation.patch

This patch enables HLE code generation by extending the current logic to
handle 'targetflags'.

* 0001-Add-target-flags-support-for-atomic-ops.patch

This patch adds target flags support to the __atomic_* builtins. It splits
the whole 32-bit order word into high and low 16-bit parts. The low
16 bits hold the original memory order, and the high 16 bits are
redefined as target-specific flags and passed through the 'targetflags'
metadata.
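
The split described above can be sketched as follows. The struct and function names are hypothetical; only the 16/16 split comes from the description.

```cpp
#include <cstdint>

// Low 16 bits: general memory order. High 16 bits: target-specific flags.
constexpr std::uint32_t kOrderMask = 0xFFFFu;
constexpr unsigned kTargetFlagShift = 16;

struct SplitOrder {
  std::uint32_t memoryOrder;  // bits 15-0 of the order word
  std::uint32_t targetFlags;  // bits 31-16 of the order word
};

// Separate a 32-bit order word into its two halves.
constexpr SplitOrder splitOrderWord(std::uint32_t word) {
  return SplitOrder{word & kOrderMask, word >> kTargetFlagShift};
}
```

An order word of 0x00010002, for instance, splits into memory order 2 and target flags 1; an order word without any high bits set yields target flags 0, so existing callers are unaffected.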

* 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch

This patch adds the '-m[no]hle' option to turn the HLE feature on or
off. Once the HLE feature is turned on, two more macros
(__ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE) are defined for
developers to mark atomic builtins.
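
As a usage sketch, an elided spin lock in the GCC style might look like the following. The lock variable, function names, and LOCK_* macros are hypothetical, and the #ifdef fallback is an assumption added so the code still builds where the HLE macros are not defined.

```cpp
// OR the HLE hint into the memory order when the macros exist; otherwise
// fall back to the plain orderings (assumption made for portability).
#ifdef __ATOMIC_HLE_ACQUIRE
#define LOCK_ACQUIRE_ORDER (__ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE)
#define LOCK_RELEASE_ORDER (__ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE)
#else
#define LOCK_ACQUIRE_ORDER __ATOMIC_ACQUIRE
#define LOCK_RELEASE_ORDER __ATOMIC_RELEASE
#endif

int lockvar = 0;  // 0 = free, 1 = held

// Spin until the exchange observes the lock as free.
void elidedLock() {
  while (__atomic_exchange_n(&lockvar, 1, LOCK_ACQUIRE_ORDER)) {
    // busy-wait; a real lock would pause or back off here
  }
}

// Release the lock with the matching release hint.
void elidedUnlock() {
  __atomic_store_n(&lockvar, 0, LOCK_RELEASE_ORDER);
}
```

On processors without HLE the prefixes are ignored, so the same code behaves as an ordinary spin lock.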

Thanks for taking the time to review!

Yours
- Michael
---
[1] http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01073.html
[2] http://software.intel.com/sites/default/files/319433-014.pdf

Michael Liao

2013-Feb-19 22:11 UTC

It seems the mailing list doesn't allow me to send a message this large.
Here is the patch 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch

Yours
- Michael

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch
Type: text/x-patch
Size: 31914 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/fff45da2/attachment.bin>

Michael Liao

2013-Feb-19 22:11 UTC

Here is the patch 0002-Add-HLE-target-feature.patch

Yours
- Michael


Michael Liao

2013-Feb-19 22:12 UTC

Here is the patch
0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch

Yours
- Michael


Michael Liao

2013-Feb-19 22:12 UTC

Here is the patch 0004-Enable-HLE-code-generation.patch

Yours
- Michael


Michael Liao

2013-Feb-19 22:13 UTC

Oops! I forgot to attach it.

- michael

On Tue, 2013-02-19 at 14:11 -0800, Michael Liao wrote:
> Here is the patch 0002-Add-HLE-target-feature.patch
> 
> Yours
> - Michael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-HLE-target-feature.patch
Type: text/x-patch
Size: 3856 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/387bc7f5/attachment.bin>

Michael Liao

2013-Feb-19 22:13 UTC

patch is attached. - michael

On Tue, 2013-02-19 at 14:12 -0800, Michael Liao wrote:
> Here is the patch
> 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
> 
> Yours
> - Michael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
Type: text/x-patch
Size: 8779 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/7fb6afc9/attachment.bin>

Michael Liao

2013-Feb-19 22:14 UTC

here is the patch! - michael

On Tue, 2013-02-19 at 14:12 -0800, Michael Liao wrote:
> Here is the patch 0004-Enable-HLE-code-generation.patch
> 
> Yours
> - Michael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Enable-HLE-code-generation.patch
Type: text/x-patch
Size: 58131 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/98963414/attachment.bin>

Michael Liao

2013-Feb-19 22:14 UTC

Here is the patch against clang,
0001-Add-target-flags-support-for-atomic-ops.patch

Yours
- Michael

On Tue, 2013-02-19 at 14:07 -0800, Michael Liao wrote:> Hi All,
> 
> I'd like to add HLE support in LLVM/clang consistent to GCC's style
[1].
> HLE from Intel TSX [2] is legacy compatible instruction set extension to
> specify transactional region by adding XACQUIRE and XRELEASE prefixes.
> To support that, GCC chooses the approach by extending the memory order
> flag in __atomic_* builtins with target-specific memory model in high
> bits (bit 31-16 for target-specific memory model, bit 15-0 for the
> general memory model.) To follow the similar approach, I propose to
> change LLVM/clang by adding:
> 
> + a metadata 'targetflags' in LLVM atomic IR to pass this
>   target-specific memory model hint
> 
> + one extra target flag in AtomicSDNode & MemIntrinsicSDNode to specify
> XACQUIRE or XRELEASE hints
>   This extra target flag is embedded into the SubclassData fields. The
> following is rationale how such target flags are embedded into
> SubclassData in SDNode
> 
>   here is the current SDNode class hierarchy of memory related nodes
> 
>   SDNode -> MemSDNode -> LSBaseNode -> LoadSDNode
>                     |             + -> StoreSDNode
>                     + -> AtomicSDNode
>                     + -> MemIntrinsicSDNode
> 
>   here is the current SubclassData definitions:
> 
>   bit 0~1 : extension type used in LoadSDNode
>   bit 0   : truncating store in StoreSDNode
>   bit 2~4 : addressing mode in LSBaseNode
>   bit 5   : volatile bit in MemSDNode
>   bit 6   : non-temporal bit in MemSDNode
>   bit 7   : invariant bit in MemSDNode
>   bit 8~11: memory order in AtomicSDNode
>   bit 12  : synch scope in AtomicSDNode
> 
>   Considering the class hierarchy, we could safely reuse bit 0~1 as the
> target flags in AtomicSDNode/MemIntrinsicSDNode
>   
> + X86 backend is modified to generate additional XACQUIRE/XRELEASE
> prefix based on the specified target flag
> 
> 
> The following are details of each patch:
> 
> * 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch
> 
> This patch adds 'targetflags' support in AtomicSDNode and
> MemIntrinsicSDNode. It will check metadata 'targetflags' and embed
> its value into SubclassData. Currently, only two bits are defined.
> 
> * 0002-Add-HLE-target-feature.patch
> 
> This patch adds HLE feature and auto-detection support
> 
> * 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
> 
> This patch adds XACQUIRE/XRELEASE prefix and its assembler/encoding
> support
> 
> * 0004-Enable-HLE-code-generation.patch
> 
> This patch enables HLE code generation by extending the current logic to
> handle 'targetflags'.
> 
> * 0001-Add-target-flags-support-for-atomic-ops.patch
> 
> This patch adds target flags support in __atomic_* builtins. It splits
> the whole 32-bit order word into high and low 16-bit parts. The low
> 16-bit is the original memory order and the high 16-bit will be
> re-defined as target-specific flags and passed through 'targetflags'
> metadata.
> 
> * 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch
> 
> It adds '-m[no]hle' option to turn on HLE feature or not. Once HLE
> feature is turned on, two more macros (__ATOMIC_HLE_ACQUIRE and
> __ATOMIC_HLE_RELEASE) are defined for developers to mark atomic
> builtins.
> 
> Thanks for your time to review!
> 
> Yours
> - Michael
> ---
> [1] http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01073.html
> [2] http://software.intel.com/sites/default/files/319433-014.pdf
> 
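The SubclassData layout listed in the quoted message can be illustrated with a small packing routine. This is a sketch only, assuming a 16-bit SubclassData word; the function and constant names are hypothetical and do not come from LLVM or from the patches. It shows why bits 0~1 are free for target flags in an atomic node: the extension-type and truncating-store bits belong to LoadSDNode and StoreSDNode, which sit on a different branch of the hierarchy.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions follow the layout in the message; names are hypothetical. */
enum {
    TARGET_FLAGS_MASK = 0x3,  /* bits 0~1: reused for target flags */
    ORDERING_SHIFT    = 8,    /* bits 8~11: memory order */
    ORDERING_MASK     = 0xF,
    SCOPE_SHIFT       = 12    /* bit 12: synch scope */
};

/* Pack the atomic-node fields into one 16-bit word (assumed width). */
static uint16_t pack_atomic_subclass_data(unsigned ordering, unsigned scope,
                                          unsigned flags) {
    assert(ordering <= ORDERING_MASK && scope <= 1 &&
           flags <= TARGET_FLAGS_MASK);
    return (uint16_t)((scope << SCOPE_SHIFT) |
                      (ordering << ORDERING_SHIFT) | flags);
}

/* Read back the two target-flag bits. */
static unsigned unpack_target_flags(uint16_t data) {
    return data & TARGET_FLAGS_MASK;
}

/* Read back the four memory-order bits. */
static unsigned unpack_ordering(uint16_t data) {
    return (data >> ORDERING_SHIFT) & ORDERING_MASK;
}
```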
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-target-flags-support-for-atomic-ops.patch
Type: text/x-patch
Size: 17724 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/ded01602/attachment.bin>

Michael Liao

2013-Feb-19 22:15 UTC

[LLVMdev] [RFC] Add Intel TSX HLE Support

Here is the patch against clang,
0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch

Yours
- Michael
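
For readers unfamiliar with the two macros this patch defines, the following is a minimal sketch of how they are meant to mark atomic builtins, in the GCC style the proposal follows. The function names and the LOCK_ORDER/UNLOCK_ORDER macros are hypothetical. The #else branch is an assumption added so that the sketch also builds without '-mhle', in which case it is an ordinary spinlock with no elision hint.

```c
#include <assert.h>

/* With -mhle the compiler defines the HLE macros and they are OR-ed into
   the memory order; without it this falls back to plain acquire/release. */
#ifdef __ATOMIC_HLE_ACQUIRE
#define LOCK_ORDER   (__ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE)
#define UNLOCK_ORDER (__ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE)
#else
#define LOCK_ORDER   __ATOMIC_ACQUIRE
#define UNLOCK_ORDER __ATOMIC_RELEASE
#endif

/* Acquire: spin until the previous value of the lock word was 0 (free). */
static void elided_lock(int *lock) {
    assert(lock);
    while (__atomic_exchange_n(lock, 1, LOCK_ORDER)) {
        /* busy-wait until the holder stores 0 */
    }
}

/* Release: store 0 back to the lock word. */
static void elided_unlock(int *lock) {
    assert(lock);
    __atomic_store_n(lock, 0, UNLOCK_ORDER);
}
```

When the hint is honoured, the exchange would carry an XACQUIRE prefix and the store an XRELEASE prefix; on hardware without HLE the prefixes are ignored and the lock behaves as written.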

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch
Type: text/x-patch
Size: 6539 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/0728094a/attachment.bin>
