Serge Pavlov via llvm-dev
2021-Mar-12 13:02 UTC
[llvm-dev] [RFC] Support of non-default floating point environment on RISC-V
Hi all,

I am interested in supporting non-default FP environments on RISC-V. This requires substantial changes to the way FP instructions are currently described, so it is important to collect opinions and concerns on this topic. Although the discussion is about RISC-V, much of the material here is relevant to any target that needs to support a non-default FP environment.

What is wrong with FP support now?

Most floating point instructions can set accrued exception bits in the `fflags` register to signal exceptional events such as overflow, invalid operation and so on. Instructions with dynamic rounding mode also depend on the content of the `frm` register. At present, RISC-V FP instructions are specified so that they completely ignore these dependencies. Such an implementation is suitable for the default FP environment only (https://llvm.org/docs/LangRef.html#floating-point-environment). When it is used in a non-default FP environment, incorrect code may be produced. For example, in the following code:

```
csrw frm, a1
fadd.d ft2, ft2, ft3
```

the compiler may change the order of the instructions, which results in incorrect behavior. Although `fadd.d` depends on the value of `frm`, this fact is not represented in the properties of FP instructions. Similarly, the code:

```
fadd.d ft2, ft2, ft3
csrrs t0, fcsr, zero
```

must not be reordered, as `csrrs` reads the content of `fflags`, which is set by the first instruction. But the compiler doesn't know about this dependency.

How to solve this problem

The descriptions of the FP instructions should be modified so that the dependencies on `fflags` and `frm` are present in them. Neither register appears as an explicit operand of the instructions; these are implicit dependencies. Usually such registers are added to the `Uses` and `Defs` properties of an `Instruction`.

RISC-V allows static rounding mode, which is taken from instruction bits rather than from `frm`.
This means that any instruction that can depend on rounding mode exists in two variants:

1. sets `fflags`, depends on `frm` (dynamic rounding mode),
2. sets `fflags`, does not depend on `frm` (static rounding mode).

Such a set of instructions precisely represents the hardware, but is not suitable for the default FP environment. Changes to `fflags` are ignored in this mode, so dependencies on `fflags` create useless output dependencies that prevent optimal scheduling. As the default FP environment is the most important use case, these variants should also be considered:

3. changes to `fflags` are ignored, does not depend on `frm` (default FP environment),
4. changes to `fflags` are ignored, depends on `frm`.

So, there can be 4 variants of each FP instruction, which is probably too many. Variant 1 must be supported, as it is the most general case in terms of restrictions. Variant 3 is also mandatory, as it represents the default FP environment. Variants 2 and 4 may be omitted, but some optimization opportunities would be lost.

Lowering of instructions in the default FP environment

Instructions like `fadd`, which are used in the default FP environment, may be lowered in a couple of ways:

- to the instruction that uses static rounding mode RNE, or
- to the instruction that uses dynamic rounding mode. In this case `frm` must contain RNE.

The case of static rounding mode has some advantages:

- It does not require synchronization of `frm` when the FP environment is changed to default,
- Code that uses only static rounding mode may be safely called from any code that uses a different rounding mode,
- Instructions with static rounding may be moved freely just like any other instructions,
- It simplifies implementation of things like `#pragma STDC FENV_ROUND`.

An issue is possible in this case. Code can set a non-default rounding mode by a call to `fesetround`, and the subsequent instructions would be executed with the new rounding mode.
As `fesetround` is usually an external function, the call instruction serves as a barrier, preventing undesired moves. On targets where `#pragma STDC FENV_ACCESS` is unsupported, this is an acceptable solution. If such code is ported to RISC-V, however, it would fail if the instructions used static rounding, because they would ignore the rounding mode set by `fesetround`.

As a temporary solution, the compiler should lower instructions in the default FP environment to the variants with dynamic rounding mode. This should decrease the risk of failure. Once constrained intrinsics are implemented for RISC-V, the lowering can be changed to use static rounding.

Are there any other things that should be considered? How many instruction variants should be supported (2, 3, 4)?

Any feedback is appreciated.

Thanks,
--Serge
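
For illustration only, the dependency argument in this proposal can be modelled in a few lines of Python. This is a minimal sketch and not LLVM code: `Instr`, `can_swap`, `fadd_d`, `write_frm` and `read_fcsr` are hypothetical names, and the `uses`/`defs` sets merely mimic the role that `Uses` and `Defs` play in an instruction description. It assumes a scheduler may swap two adjacent instructions whenever they share no read-after-write, write-after-read or write-after-write register dependency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: a name plus the registers it reads and writes."""
    name: str
    uses: frozenset
    defs: frozenset


def can_swap(first, second):
    """Return True if no register dependency orders `first` before `second`."""
    raw = first.defs & second.uses   # second reads what first writes
    war = first.uses & second.defs   # second overwrites what first reads
    waw = first.defs & second.defs   # both write the same register
    return not (raw or war or waw)


def fadd_d(rd, rs1, rs2, dynamic_rm, tracks_flags):
    """Build a toy fadd.d with optional implicit frm/fflags dependencies."""
    uses = {rs1, rs2}
    defs = {rd}
    if dynamic_rm:
        uses.add("frm")      # dynamic rounding mode reads frm
    if tracks_flags:
        defs.add("fflags")   # accrued exception bits are written
    return Instr("fadd.d", frozenset(uses), frozenset(defs))


# Hypothetical stand-ins for the CSR accesses in the two assembly examples.
write_frm = Instr("write frm", frozenset({"a1"}), frozenset({"frm"}))
read_fcsr = Instr("read fcsr", frozenset({"frm", "fflags"}), frozenset({"t0"}))

# No implicit registers: the situation the proposal describes as current.
untracked = fadd_d("ft2", "ft2", "ft3", dynamic_rm=False, tracks_flags=False)
# Dynamic rounding mode with fflags tracked: the most restrictive variant.
strict = fadd_d("ft2", "ft2", "ft3", dynamic_rm=True, tracks_flags=True)

print(can_swap(write_frm, untracked))  # True: the unsafe reordering is allowed
print(can_swap(untracked, read_fcsr))  # True: the unsafe reordering is allowed
print(can_swap(write_frm, strict))     # False: fadd.d now reads frm
print(can_swap(strict, read_fcsr))     # False: fadd.d now writes fflags

# Two additions on unrelated registers still conflict through fflags.
a = fadd_d("ft0", "ft0", "ft1", dynamic_rm=True, tracks_flags=True)
b = fadd_d("ft4", "ft4", "ft5", dynamic_rm=True, tracks_flags=True)
print(can_swap(a, b))                  # False: output dependency on fflags
```

The last check shows why a variant that ignores `fflags` is still worth having for the default FP environment: in this toy model, once every FP instruction defines `fflags`, even unrelated additions can no longer be reordered.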