Andrew Kelley via llvm-dev
2020-Jan-23 06:49 UTC
[llvm-dev] How to find out the default CPU / Features String for a given triple?
When I pass an empty string for cpu and features to createTargetMachine, and then use LLVMGetTargetMachineCPU() and LLVMGetTargetMachineFeatureString() to get the strings back, they are still empty. Is there a way to have LLVM compute the effective cpu/features string and expose it so that I can inspect it?

I'm trying to figure out how the cpu/features string that I am explicitly passing, which I expect to be equivalent to passing the empty string, actually differs.

As an example, I have a test passing in the CI using the triple "aarch64v8.1a-unknown-linux-unknown" and "" for both the CPU name and the target features string. However, when I pass the following target-specific features string, qemu crashes in the CI:

-a35,-a53,-a55,-a57,-a72,-a73,-a75,-a76,-aes,-aggressive-fma,-alternate-sextload-cvt-f32-pattern,+altnzcv,+am,-arith-bcc-fusion,-arith-cbz-fusion,-balance-fp-ops,+bti,-call-saved-x10,-call-saved-x11,-call-saved-x12,-call-saved-x13,-call-saved-x14,-call-saved-x15,-call-saved-x18,-call-saved-x8,-call-saved-x9,+ccdp,+ccidx,+ccpp,+complxnum,+crc,-crypto,-custom-cheap-as-move,-cyclone,-disable-latency-sched-heuristic,+dit,+dotprod,-exynos-cheap-as-move,-exynosm1,-exynosm2,-exynosm3,-exynosm4,-falkor,+fmi,-force-32bit-jump-tables,+fp-armv8,-fp16fml,+fptoint,-fullfp16,-fuse-address,+fuse-aes,-fuse-arith-logic,-fuse-crypto-eor,-fuse-csel,-fuse-literals,+jsconv,-kryo,+lor,+lse,-lsl-fast,+mpam,-mte,+neon,-no-neg-immediates,+nv,+pa,+pan,+pan-rwv,+perfmon,-predictable-select-expensive,+predres,-rand,+ras,+rasv8_4,+rcpc,+rcpc-immo,+rdm,-reserve-x1,-reserve-x10,-reserve-x11,-reserve-x12,-reserve-x13,-reserve-x14,-reserve-x15,-reserve-x18,-reserve-x2,-reserve-x20,-reserve-x21,-reserve-x22,-reserve-x23,-reserve-x24,-reserve-x25,-reserve-x26,-reserve-x27,-reserve-x28,-reserve-x3,-reserve-x4,-reserve-x5,-reserve-x6,-reserve-x7,-reserve-x9,-saphira,+sb,+sel2,-sha2,-sha3,-slow-misaligned-128store,-slow-paired-128,-slow-strqro-store,-sm4,-spe,+specrestrict,+ssbs,-strict-align,-sve,-sve2,-sve2-aes,-sve2-bitperm,-sve2-sha3,-sve2-sm4,-thunderx,-thunderx2t99,-thunderxt81,-thunderxt83,-thunderxt88,+tlb-rmi,-tpidr-el1,-tpidr-el2,-tpidr-el3,+tracev8.4,-tsv110,+uaops,-use-aa,+use-postra-scheduler,-use-reciprocal-square-root,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+vh,-zcm,-zcz,-zcz-fp,-zcz-fp-workaround,-zcz-gp

Now, I understand that the qemu crash can be fixed by using a newer qemu version. And indeed, on my laptop with a newer qemu, the test passes. However, this test was previously passing on the CI, which means that with this explicit string of target-specific features, *something is different*. I am trying to determine what that different thing is.

Thanks in advance for the help,
Andrew
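P.S. In case it helps to see the exact shape of what I'm doing, here is a minimal sketch (LLVM-C API; the triple is illustrative and error handling is kept to a minimum):

    #include <llvm-c/TargetMachine.h>
    #include <stdio.h>

    int main(void) {
      LLVMInitializeAArch64TargetInfo();
      LLVMInitializeAArch64Target();
      LLVMInitializeAArch64TargetMC();

      LLVMTargetRef T;
      char *Err = NULL;
      if (LLVMGetTargetFromTriple("aarch64-unknown-linux-gnu", &T, &Err)) {
        fprintf(stderr, "%s\n", Err);
        LLVMDisposeMessage(Err);
        return 1;
      }

      /* Empty CPU and features strings, to get the target's defaults. */
      LLVMTargetMachineRef TM = LLVMCreateTargetMachine(
          T, "aarch64-unknown-linux-gnu", "", "",
          LLVMCodeGenLevelDefault, LLVMRelocDefault, LLVMCodeModelDefault);

      /* These come back empty rather than the computed effective values. */
      char *CPU = LLVMGetTargetMachineCPU(TM);
      char *Features = LLVMGetTargetMachineFeatureString(TM);
      printf("cpu='%s' features='%s'\n", CPU, Features);

      LLVMDisposeMessage(CPU);
      LLVMDisposeMessage(Features);
      LLVMDisposeTargetMachine(TM);
      return 0;
    }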
David Spickett via llvm-dev
2020-Jan-23 10:04 UTC
[llvm-dev] How to find out the default CPU / Features String for a given triple?
Hi Andrew,

What tools are you passing this triple to, and where did you get the list of features you showed? (It looks like an MC-level list to me.)

Depending on the tool, you'll get different levels of expansion of the features: "clang -###" is the most minimal, and then some things are expanded later in llvm-mc (exactly where is somewhat inconsistent).

David Spickett.
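P.S. For example, something like this (illustrative; the exact expansion depends on your clang version and how it was configured):

    clang --target=aarch64-unknown-linux-gnu -march=armv8.1-a -### -c test.c

will print the -cc1 invocation, including the "-target-cpu" and "-target-feature" arguments the driver computed from the triple and arch.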
Renato Golin via llvm-dev
2020-Jan-23 10:34 UTC
[llvm-dev] How to find out the default CPU / Features String for a given triple?
On Thu, 23 Jan 2020 at 06:49, Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> When I pass an empty string for cpu and features to createTargetMachine,
> and then use LLVMGetTargetMachineCPU() and
> LLVMGetTargetMachineFeatureString() to get the strings back, they are
> still empty. Is there a way to have llvm compute the effective
> cpu/features string, and provide it so that I can inspect it?

Hi Andrew,

These are the supported features for AArch64:

https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/AArch64TargetParser.def

> As an example, I have a test passing in the CI using the triple
> "aarch64v8.1a-unknown-linux-unknown", and "" for both CPU name and
> target features string.

"aarch64v8.1a" doesn't exist. Target and arch are two different things. The target is, for example, "aarch64-linux-gnu", while the arch (not CPU) is "armv8.1-a" (not "v8.1a"). CPUs are their literal names, like "cortex-a53".

IIRC, we allow adding modifiers like "+sve" to the CPU for compatibility, but they should be used when defining the arch, not the CPU. Unfortunately, we don't have control over the triples and arches and have to support what vendors and prior tools do.

> However when I pass the following target
> specific features string, I get qemu crashing in the CI:

That feature string contains a lot more than what's accepted for target/arch/cpu flags. There are a lot of target-specific flags in there, and I'm not even sure they would parse in a target string at all. Unfortunately, without knowing how QEMU is using it or where the error comes from, it's hard to tell what it means.

> Now, I understand that qemu crashing can be fixed by using a newer qemu
> version. And, indeed, on my laptop with a newer qemu, the test passes.
> However, this test was previously passing on the CI, which means that
> with this explicit string of target specific features, *something is
> different*. I am trying to determine what that different thing is.

That string is invalid. Is it possible that either QEMU or LLVM used to accept it, by mistake or for legacy reasons, and that's no longer true? I'm surprised a newer QEMU somehow understands it, but without understanding your infrastructure, what the test is and how it's built, where LLVM fits into all of that, etc., it's hard to have any idea what's going on.

cheers,
--renato
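P.S. To make the distinction concrete, with illustrative clang invocations:

    # The target (triple) and the arch are separate things:
    clang --target=aarch64-linux-gnu -march=armv8.1-a -c test.c

    # CPUs are their literal names:
    clang --target=aarch64-linux-gnu -mcpu=cortex-a53 -c test.c

    # Modifiers like "+sve" attach to the arch:
    clang --target=aarch64-linux-gnu -march=armv8.2-a+sve -c test.c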
Andrew Kelley via llvm-dev
2020-Jan-23 16:22 UTC
[llvm-dev] How to find out the default CPU / Features String for a given triple?
On 1/23/20 5:04 AM, David Spickett wrote:

> Hi Andrew,
>
> What tools are you passing this triple to, and where did you get the
> list of features you showed? (looks like an MC level list to me)

The triple is passed to createTargetMachine, from include/llvm/Support/TargetRegistry.h:

    /// createTargetMachine - Create a target specific machine implementation
    /// for the specified \p Triple.
    ///
    /// \param TT This argument is used to determine the target machine
    /// feature set; it should always be provided. Generally this should be
    /// either the target triple from the module, or the target triple of the
    /// host if that does not exist.
    TargetMachine *createTargetMachine(StringRef TT, StringRef CPU,
                                       StringRef Features,
                                       const TargetOptions &Options,
                                       Optional<Reloc::Model> RM,
                                       Optional<CodeModel::Model> CM = None,
                                       CodeGenOpt::Level OL = CodeGenOpt::Default,
                                       bool JIT = false) const;

The list of features I showed comes from LLVM's lib/Target/AArch64/AArch64.td file, processed into this file:

https://github.com/ziglang/zig/blob/c86589a738e5053a26b2be7d156bcfee1565f00b/lib/std/target/aarch64.zig

Next, the "generic" CPU is chosen, the feature corresponding to the sub-arch (v8_5a) is added to the set given by the CPU, and then the set is recursively populated with feature dependencies.

I'm trying to find out how to check my work here, and make sure the list of features matches what LLVM chooses when empty strings are passed for CPU and Features.
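One approach I'm considering for the check, sketched below. I'm assuming that TargetMachine::getMCSubtargetInfo() exposes the subtarget that createTargetMachine() resolved, and that MCSubtargetInfo::checkFeatures() accepts a "+feat"/"-feat" list; please correct me if those aren't the right entry points:

    #include "llvm/ADT/Twine.h"
    #include "llvm/MC/MCSubtargetInfo.h"
    #include "llvm/Target/TargetMachine.h"

    // Report whether a single named feature ended up enabled in the
    // subtarget that a TargetMachine (created with empty CPU/features
    // strings) actually resolved to.
    bool hasFeature(const llvm::TargetMachine &TM, llvm::StringRef Feature) {
      const llvm::MCSubtargetInfo *STI = TM.getMCSubtargetInfo();
      // checkFeatures() returns whether the subtarget's feature bits
      // satisfy the given "+feat,-feat" style list.
      return STI->checkFeatures((llvm::Twine("+") + Feature).str());
    }

That would let me loop over my computed list and compare each "+"/"-" entry against what LLVM resolved internally.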