Dmitry Ponyatov via llvm-dev
2018-Jun-01 07:01 UTC
[llvm-dev] endianness and bit fields in struct{}
Good day,

Is there any work in the LLVM development community aimed at extending the IR model to support bit fields and arbitrary endianness manipulation? I'm looking at LLVM as a backend for a data-language compiler.

I work with IoT (LoRa) and have a problem specifying data formats for a wide variety of end-node devices (M2M/SmartHome). Because the packet size is small (50..200 bytes), only maximally compact binary data exchange protocols are acceptable.

As an example, here is a sample in C:

    #include <stdint.h>
    #include <stdbool.h>

    struct __attribute__((packed)) {
        uint8_t A;
        __attribute__((scalar_storage_order("big"))) int16_t B;
        float C;
        unsigned int D:3;
        signed int E:7;
        bool F;
    } Packet;

    void encode() {
        Packet.A=1; Packet.B=2;     // integers
        Packet.C=3;                 // float
        Packet.E=4; Packet.E=-5;    // bit field integers
        Packet.F = true;            // boolean
    }

The problem is that this definition must be truly platform independent: the same code must work identically on Cortex-M, ARM9 and MIPS, and on the north side (primarily x86_64). So I need the attributes supported by GCC 6: __attribute__((packed)) for structures and __attribute__((big/littleendian)) for struct fields. As an alternative, I'm going to write a special DDL schema compiler along the lines of ASN.1 compilers (with a human-readable syntax for 1C programmers).

Looking at the code generated by clang, it seems that LLVM IR not only has no support for endianness attributes but also has poor support for the bit-field manipulations widely used in most embedded software (and some MCUs and FPGA-synthesized cores have bit-field operations in hardware). For example, the generated IR uses inline and/or bit masking, which should be replaced with dedicated bit-field operations on arbitrary-length integers such as ui3 and si7:
    ; ModuleID = 'bitstruct.c'
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-pc-linux-gnu"

    %struct.anon = type <{ i8, i16, float, i16, i8 }>

    @Packet = common global %struct.anon zeroinitializer, align 1

    ; Function Attrs: nounwind uwtable
    define void @encode() #0 {
      store i8 1, i8* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 0), align 1
      store i16 2, i16* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 1), align 1
      store float 3.000000e+00, float* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 2), align 1
      %1 = load i16, i16* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 3), align 1
      %2 = and i16 %1, -1017
      %3 = or i16 %2, 32
      store i16 %3, i16* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 3), align 1
      %4 = load i16, i16* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 3), align 1
      %5 = and i16 %4, -1017
      %6 = or i16 %5, 984
      store i16 %6, i16* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 3), align 1
      store i8 1, i8* getelementptr inbounds (%struct.anon, %struct.anon* @Packet, i32 0, i32 4), align 1
      ret void
    }

    !0 = !{!"clang version 3.8.1-24 (tags/RELEASE_381/final)"}

------------------------------
With best regards,
Dmitry Ponyatov, Icbcom, IoT/embedded engineer, tel. +7 917 10 10 818
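For readers following the IR in this message, the constants -1017, 32 and 984 can be reproduced by hand. The following is a minimal sketch in C, assuming the layout that the masks imply for this x86_64 build (E occupying bits 3..9 of the 16-bit storage unit shared with D). The helper names field_mask and set_field are hypothetical and are not part of LLVM or clang.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, inferred from the masks in the IR: D is in bits 0..2
   and E is in bits 3..9 of one 16-bit storage unit. */
enum { E_SHIFT = 3, E_WIDTH = 7 };

/* Build a mask of `width` one-bits starting at bit `shift`.
   For E this is 0x03F8; its 16-bit complement is 0xFC07, i.e. -1017. */
uint16_t field_mask(unsigned shift, unsigned width) {
    assert(width > 0 && shift + width <= 16);
    return (uint16_t)(((1u << width) - 1u) << shift);
}

/* Read-modify-write of one field: clear its bits, then OR in the new
   value. Negative values are truncated to `width` bits (two's complement). */
uint16_t set_field(uint16_t unit, unsigned shift, unsigned width, int value) {
    uint16_t mask = field_mask(shift, width);
    uint16_t bits = (uint16_t)(((unsigned)value << shift) & mask);
    return (uint16_t)((unit & (uint16_t)~mask) | bits);
}
```

Calling set_field(unit, E_SHIFT, E_WIDTH, 4) ORs in 32, and the same call with -5 ORs in 984, which matches the two and/or pairs in the IR.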
Tim Northover via llvm-dev
2018-Jun-01 10:03 UTC
[llvm-dev] endianness and bit fields in struct{}
Hi Dmitry,

On 1 June 2018 at 08:01, Dmitry Ponyatov via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Is there any work in the LLVM development community aimed at extending the IR
> model to support bit fields and arbitrary endianness manipulation?

Not as far as I'm aware. If anything the reverse is happening and we're simplifying the type system. We once had union types but they were removed years ago due to bitrotting, and there's an ongoing effort to move to just a single pointer type.

> Looking at the code generated by clang, it seems that LLVM IR not only has no
> support for endianness attributes

Yes, this would be handled by the @llvm.bswap.* intrinsics, which can be trivially folded into loads & stores if instructions are available. LLVM IR is a load/store architecture and endianness attributes really don't justify themselves in that regime.

> For example, the generated IR uses inline and/or bit masking, which should be
> replaced with dedicated bit-field operations on arbitrary-length integers such
> as ui3 and si7.

We certainly *could* do that, but IMO the motivation is pretty weak there too.

Firstly, backends tend to produce better code for natively sized types, or powers of 2 at the worst. Others should work, but they're not what anyone optimizes for (which leads to a bit of a vicious circle where front-ends avoid using them, for better or worse).

Also, the existing instructions really do represent what most CPUs are going to do to manipulate bitfields. Hiding that behind special instructions just means they're going to be expanded later and complicates the cost-models on average. Targets that do have special bitfield instructions (AArch64 for example) can pattern-match those manipulations without too much difficulty to make use of them.

Cheers.

Tim.
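One way to get a fixed wire byte order without any language or IR extension is to serialize each field with shifts, so the result does not depend on the host. Below is a minimal sketch in C, with hypothetical helper names put_be16 and get_be16. Compilers can often recognize such patterns and emit a byte swap or a plain store where appropriate, though that depends on the target and optimization level.

```c
#include <stdint.h>

/* Write a 16-bit value to buf in big-endian byte order.
   Only arithmetic on the value is used, never a reinterpretation of
   memory, so the bytes produced are the same on any host. */
void put_be16(uint8_t *buf, int16_t value) {
    uint16_t u = (uint16_t)value;
    buf[0] = (uint8_t)(u >> 8);    /* most significant byte first */
    buf[1] = (uint8_t)(u & 0xFFu); /* least significant byte second */
}

/* Read a big-endian 16-bit value back as a signed integer. */
int16_t get_be16(const uint8_t *buf) {
    uint16_t u = (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
    /* Convert to signed without relying on implementation-defined casts. */
    return (u & 0x8000u) ? (int16_t)((int)u - 65536) : (int16_t)u;
}
```

A field such as B from the struct in the original message could then be written with put_be16(&packet_bytes[1], 2), where packet_bytes is an assumed raw byte buffer for the packet.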