Iain Sandoe
2014-Sep-05 00:39 UTC
[LLVMdev] BUILD_PAIR and EXTRACT_ELEMENT and endianness.
The descriptions of these two operations:

EXTRACT_ELEMENT - This is used to get the lower or upper (determined by a Constant, which is required to be operand #1) half of the integer or float value specified as operand #0. This is only for use before legalization, for values that will be broken into multiple registers.

BUILD_PAIR - This is the opposite of EXTRACT_ELEMENT in some ways. Given two values of the same integer value type, this produces a value twice as big. Like EXTRACT_ELEMENT, this can only be used before legalization.

These descriptions are ambiguous about whether the halves are defined in terms of significance or memory order. This does not matter for LE machines (where the less significant half is also the one at the lower address), but it does for BE.

=====

For PPC, we have the following sequence occurring.

A=<fp64 MS> : B=<fp64 LS> <= function arguments; significance order determined by the ABI.

These are then combined by a BUILD_PAIR with Lo = B and Hi = A (swapped, indicating that we expect BUILD_PAIR to consider Lo/Hi as significance, rather than memory order).

The result is then bitcast to i128.

Now, if that i128 is placed in memory it will have incorrect ordering unless either BUILD_PAIR or the bitcast understands the endianness of the target (which doesn't _appear_ to be the case).

We then see two EXTRACT_ELEMENT ops (which seem to be assuming that the expected ordering is memory order). The results of these are also swapped and then bitcast to fp64s, which are passed to another routine.

In this case there are two swaps that cancel out, so the call part of the ABI is not broken. However, the objects are broken in memory (the i128 happens to be stored on the stack), with the two halves in the wrong memory order.

=====

I suppose that first we need to have a clear definition of the meaning of Lo and Hi in BUILD_PAIR and of the expectations for 0 and 1 in EXTRACT_ELEMENT. Then we can work through the SelectionDAG builder code and try to figure out where the extra swap is being introduced.

Any insight would be most appreciated.

cheers
Iain

P.S. example problem code at: https://gist.github.com/iains/b306d3a02b6581fd7ca6
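
To make the ambiguity concrete, here is a small standalone sketch. It is not LLVM code: Pair128, Endian and every function name below are made up for illustration, and the model assumes a 128-bit value is simply two 64-bit halves. It builds a pair under each of the two readings of the BUILD_PAIR operands (significance vs. memory order) and compares the resulting memory images.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model only: a 128-bit value as two halves named by significance.
struct Pair128 {
  uint64_t lo; // least-significant 64 bits
  uint64_t hi; // most-significant 64 bits
};

enum class Endian { Little, Big };

// Reading 1: operand 0 is the least-significant half, operand 1 the
// most-significant half, whatever the target's byte order.
Pair128 buildPairBySignificance(uint64_t op0, uint64_t op1) {
  return Pair128{op0, op1};
}

// Reading 2: operand 0 is the half that ends up at the lower address.
// On LE that is the least-significant half; on BE the most-significant one.
Pair128 buildPairByMemoryOrder(uint64_t op0, uint64_t op1, Endian e) {
  if (e == Endian::Little)
    return Pair128{op0, op1};
  return Pair128{op1, op0};
}

// Memory image as two 64-bit words; index 0 is the lower address.
std::array<uint64_t, 2> storeToMemory(Pair128 v, Endian e) {
  if (e == Endian::Little)
    return {v.lo, v.hi};
  return {v.hi, v.lo};
}

// True when both readings of the operands give the same memory image.
bool interpretationsAgree(uint64_t op0, uint64_t op1, Endian e) {
  return storeToMemory(buildPairBySignificance(op0, op1), e) ==
         storeToMemory(buildPairByMemoryOrder(op0, op1, e), e);
}
```

In this model the two readings give the same image on LE and swapped images on BE (whenever the halves differ), which is why the definition only matters for BE targets. With Lo = B and Hi = A as in the PPC sequence above, the significance reading stores A at the lower address on BE, and the memory-order reading stores B there.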