Hi All,

I'd like to add HLE support to LLVM/clang, consistent with GCC's approach [1]. HLE, from Intel TSX [2], is a legacy-compatible instruction set extension that marks a transactional region by adding XACQUIRE and XRELEASE prefixes. To support it, GCC extends the memory order argument of the __atomic_* builtins with a target-specific memory model in the high bits (bits 31-16 for the target-specific memory model, bits 15-0 for the general memory model). To follow a similar approach, I propose changing LLVM/clang as follows:

+ Add a metadata 'targetflags' to LLVM atomic IR to pass this target-specific memory model hint.

+ Add one extra target flag to AtomicSDNode and MemIntrinsicSDNode to specify the XACQUIRE or XRELEASE hint. This extra target flag is embedded into the SubclassData field. The rationale for how the target flags are embedded into SubclassData in SDNode follows.

  Here is the current SDNode class hierarchy of memory-related nodes:

    SDNode -> MemSDNode -> LSBaseSDNode -> LoadSDNode
                 |            + -> StoreSDNode
                 + -> AtomicSDNode
                 + -> MemIntrinsicSDNode

  Here are the current SubclassData definitions:

    bit 0~1 : extension type used in LoadSDNode
    bit 0   : truncating store in StoreSDNode
    bit 2~4 : addressing mode in LSBaseSDNode
    bit 5   : volatile bit in MemSDNode
    bit 6   : non-temporal bit in MemSDNode
    bit 7   : invariant bit in MemSDNode
    bit 8~11: memory order in AtomicSDNode
    bit 12  : synch scope in AtomicSDNode

  Considering the class hierarchy, we can safely reuse bits 0~1 as the target flags in AtomicSDNode/MemIntrinsicSDNode, since the table above assigns those bits only to the load and store subclasses.

+ Modify the X86 backend to generate an additional XACQUIRE/XRELEASE prefix based on the specified target flag.

The following are the details of each patch:

* 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch

  This patch adds 'targetflags' support to AtomicSDNode and MemIntrinsicSDNode. It checks the 'targetflags' metadata and embeds its value into SubclassData. Currently, only two bits are defined.

* 0002-Add-HLE-target-feature.patch

  This patch adds the HLE feature and auto-detection support.

* 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch

  This patch adds the XACQUIRE/XRELEASE prefixes and their assembler/encoding support.

* 0004-Enable-HLE-code-generation.patch

  This patch enables HLE code generation by extending the current logic to handle 'targetflags'.

* 0001-Add-target-flags-support-for-atomic-ops.patch

  This patch adds target flags support to the __atomic_* builtins. It splits the 32-bit order word into high and low 16-bit halves. The low 16 bits hold the original memory order; the high 16 bits are redefined as target-specific flags and passed through the 'targetflags' metadata.

* 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch

  This patch adds the '-m[no]hle' option to turn the HLE feature on or off. Once the HLE feature is turned on, two more macros (__ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE) are defined for developers to mark atomic builtins.

Thanks for taking the time to review!

Yours
- Michael
---
[1] http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01073.html
[2] http://software.intel.com/sites/default/files/319433-014.pdf
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-targetflags-in-AtomicSDNode-MemIntrinsicSDNode.patch
Type: text/x-patch
Size: 31914 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/80b09a00/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-HLE-target-feature.patch
Type: text/x-patch
Size: 3856 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/80b09a00/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
Type: text/x-patch
Size: 8779 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/80b09a00/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Enable-HLE-code-generation.patch
Type: text/x-patch
Size: 58131 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/80b09a00/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-target-flags-support-for-atomic-ops.patch
Type: text/x-patch
Size: 17724 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/80b09a00/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-mhle-option-support-and-populate-pre-defined-mac.patch
Type: text/x-patch
Size: 6539 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130219/80b09a00/attachment-0005.bin>
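
For readers following the proposal, the 16/16 split of the memory order word described in the message can be sketched roughly as below. This is only an illustration under assumptions: the helper names (SplitOrder, splitOrderWord) and the example flag constants are hypothetical and are not taken from the patches; the real flag values would come from the compiler's __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE macros.

```cpp
#include <cstdint>

// Hypothetical names throughout; only the bit ranges come from the message.
constexpr std::uint32_t kGeneralModelMask = 0x0000FFFFu;  // bits 15-0
constexpr unsigned kTargetFlagsShift = 16;                // bits 31-16

// Example flag values chosen for illustration only (assumption).
constexpr std::uint32_t kExampleHleAcquire = 1u << 16;
constexpr std::uint32_t kExampleHleRelease = 1u << 17;

struct SplitOrder {
  std::uint32_t memoryOrder;  // general memory model, from the low 16 bits
  std::uint32_t targetFlags;  // target-specific bits, shifted down from the high 16 bits
};

// Split a combined order word into its general and target-specific parts.
constexpr SplitOrder splitOrderWord(std::uint32_t order) {
  return SplitOrder{order & kGeneralModelMask, order >> kTargetFlagsShift};
}
```

Under this sketch, a front end would keep memoryOrder as the ordinary atomic ordering and forward targetFlags separately, which is the role the message assigns to the 'targetflags' metadata.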
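
The SubclassData argument in the message (bits 0~1 are unused by atomic nodes, so they can carry the target flags) can be made concrete with a small packing sketch. The bit positions are copied from the table in the message; every identifier here is hypothetical and none of this is LLVM's actual API.

```cpp
#include <cstdint>

// Bit positions follow the table in the message; names are hypothetical.
constexpr unsigned kTargetFlagsMask = 0x3;  // bits 0-1, proposed for reuse
constexpr unsigned kVolatileBit = 5;        // volatile bit (MemSDNode)
constexpr unsigned kOrderingShift = 8;      // bits 8-11, memory order
constexpr unsigned kOrderingMask = 0xF;
constexpr unsigned kSynchScopeBit = 12;     // synch scope

// Pack the fields an atomic node would carry into one 16-bit word.
constexpr std::uint16_t encodeAtomicSubclassData(unsigned ordering,
                                                 bool synchScope,
                                                 bool isVolatile,
                                                 unsigned targetFlags) {
  return static_cast<std::uint16_t>(
      (targetFlags & kTargetFlagsMask) |
      ((isVolatile ? 1u : 0u) << kVolatileBit) |
      ((ordering & kOrderingMask) << kOrderingShift) |
      ((synchScope ? 1u : 0u) << kSynchScopeBit));
}

// Read back the two fields of interest.
constexpr unsigned targetFlagsOf(std::uint16_t data) {
  return data & kTargetFlagsMask;
}
constexpr unsigned orderingOf(std::uint16_t data) {
  return (data >> kOrderingShift) & kOrderingMask;
}
```

Because the target flags occupy bits that only load and store nodes use, setting them leaves the ordering and synch scope fields untouched, which is the property the proposal relies on.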
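
As a rough picture of how a developer might mark atomic builtins with the two macros, here is a minimal spin lock sketch. It assumes a GCC- or clang-style compiler that provides the __atomic_* builtins; ElidedSpinLock, kHleAcquireHint and kHleReleaseHint are hypothetical names, and the hints fall back to 0 (no HLE hint) when the compiler does not predefine the macros.

```cpp
// Use the compiler's HLE macros when present; otherwise pass no hint.
#if defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
constexpr int kHleAcquireHint = __ATOMIC_HLE_ACQUIRE;
constexpr int kHleReleaseHint = __ATOMIC_HLE_RELEASE;
#else
constexpr int kHleAcquireHint = 0;
constexpr int kHleReleaseHint = 0;
#endif

// Hypothetical lock type for illustration only.
struct ElidedSpinLock {
  int word = 0;  // 0 = free, 1 = held

  void lock() {
    // The exchange returns the previous value; spin until it was 0 (free).
    while (__atomic_exchange_n(&word, 1, __ATOMIC_ACQUIRE | kHleAcquireHint)) {
      // Busy-wait; a production lock would likely pause or back off here.
    }
  }

  void unlock() {
    // Release the lock, carrying the release-side hint if available.
    __atomic_store_n(&word, 0, __ATOMIC_RELEASE | kHleReleaseHint);
  }
};
```

The general memory order stays in the low bits and the hint is simply OR-ed in, so the same source still behaves as an ordinary acquire/release lock when no hint is available.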
Hi Michael,

Why do you want to add transactional memory support to LLVM? Can't you implement transactional memory using a library call? Judging by the number of patches, this looks like a major change to LLVM, and I am not sure that I understand the motivation for including it in LLVM.

Thanks,
Nadav

On Feb 19, 2013, at 11:52 AM, Michael Liao <michael.liao at intel.com> wrote:
> [...]
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
Nadav,

I've been reading over the patches and I was wondering if you could elaborate on your concerns here. I share your goal of reducing compile-time regressions for users who don't care about new feature X. From my very quick glance over the patches, I didn't see anything I couldn't opt out of. Maybe we can talk about specifics and figure out a way to keep these changes from affecting other users and targets.

Let's say I care about a non-X86, non-TM target and compilation time. What's the negative impact on me from these patches? Is there a cost I can't opt out of?

Let's say I care about X86 non-TM compilation time. What additional costs am I burdened with?

On Feb 19, 2013, at 12:54 PM, Nadav Rotem <nrotem at apple.com> wrote:
> [...]