thr3ads.net - llvm dev - [llvm-dev] How to change CLang struct alignment behaviour? [May 2019]

Joan Lluch via llvm-dev

2019-May-13 09:17 UTC

[llvm-dev] How to change CLang struct alignment behaviour?

I had already adjusted MaxStoresPerMemcpy to my preferred value, and this works
great, but the odd behaviour still happens where loads/stores are used on
structs whose size is not word-aligned. For my 3-char, 3-byte struct test, the
memcpy replacement appears to consist of a single-byte load and store of the
last char (this is correct), followed by a 16-bit move of the first two chars
(also correct). The odd thing is that the 16-bit move is performed by picking
bytes separately from the source struct, combining them into a 16-bit value by
means of shifts, a swap and an or, and then storing the result as a word to the
destination. I don’t know why simple 16-bit loads and stores are not used
instead. I will try to step through findOptimalMemOpLowering with the debugger
to see if that sheds some light. Thanks for pointing it out.
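
For reference, a test case of the kind described above might look like the following minimal sketch. The names `Triple` and `copyTriple` are hypothetical and do not come from the original post, and it is an assumption that a plain struct assignment is what ends up as the small inline memcpy under discussion.

```cpp
#include <cstddef>

// Hypothetical test struct: three char members, no padding expected.
struct Triple {
    char a;
    char b;
    char c;
};

// On common ABIs this struct is 3 bytes with 1-byte alignment,
// so its size is not a multiple of a 16-bit word.
static_assert(sizeof(Triple) == 3, "expected a 3-byte struct");
static_assert(alignof(Triple) == 1, "expected byte alignment");

// A plain struct assignment; compilers typically lower this to a
// small memcpy, which the backend may then expand into loads/stores.
void copyTriple(Triple *dst, const Triple *src) {
    *dst = *src;
}
```

Compiling a function like this for the target and inspecting the generated assembly is one way to see which loads and stores the backend actually emits.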

John

Tel: 620 28 45 13
> On 13 May 2019, at 09:33, Tim Northover <t.p.northover at gmail.com> wrote:
>
> Hi Joan,
>
> On Mon, 13 May 2019 at 07:53, Joan Lluch <joan.lluch at icloud.com> wrote:
>> The reason I want structs to be aligned/padded to 2 bytes is because my
>> architecture only has 16 bit operations. I can read (sign and zero extended) and
>> write (truncated) 8 bit data from/to memory, but all intermediate operations in
>> registers are performed in 16 bit registers.
>
> This is very normal. Mostly it's at 32-bits rather than 16, but it
> applies to basically every RISC architecture so LLVM should handle it
> well without adjusting the alignment requirements of types.
>
>> This causes LLVM to generate odd tricks such as shifts and byte-swaps,
>> when trying to replace struct ‘memcpy’s by word sized load/store instructions.
>
> That sounds odd, as if you've not taught the backend to use those
> 8-bit loads and stores so it's trying to emulate them with word-sized
> ones (early Alpha chips genuinely didn't have byte access so had to do
> that kind of thing). You can (and should) probably fix that.
>
> Also, there are a few customization points where you can control how
> memcpy is implemented. The function "findOptimalMemOpLowering" lets
> you control the type used for the loads and stores, and
> MaxStoresPerMemcpy controls when LLVM will call the real memcpy. If
> you want even more control you can implement EmitTargetCodeForMemcpy
> to do the whole thing.
> 
> Cheers.
> 
> Tim.
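
The interaction between a store limit and the chosen access width, as described in the quoted message, can be illustrated with a toy model. This is only a sketch and not LLVM code: `planInlineCopy` and its parameters are hypothetical names, `maxWidth` is assumed to be a power of two, and a `false` return stands in for "call the real memcpy instead".

```cpp
#include <cstddef>
#include <vector>

// Toy model only (NOT LLVM's API). Splits a copy of `size` bytes into
// chunks no wider than `maxWidth` bytes, widest first, halving the
// width for the tail. Returns false when more than `maxStores` chunks
// would be needed, which models falling back to a library memcpy call.
bool planInlineCopy(std::vector<std::size_t> &widths, std::size_t size,
                    std::size_t maxWidth, std::size_t maxStores) {
    widths.clear();
    std::size_t width = maxWidth;
    while (size > 0) {
        if (widths.size() == maxStores)
            return false;            // limit reached with bytes left over
        while (width > size)
            width /= 2;              // shrink the access for the tail
        widths.push_back(width);
        size -= width;
    }
    return true;
}
```

In this model a 3-byte copy with 2-byte accesses becomes one 2-byte chunk plus one 1-byte chunk, while an 8-byte copy with a limit of two stores is rejected.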

Joan Lluch via llvm-dev

2019-May-13 17:01 UTC

Hi Tim,

After looking at it a bit further, I think this is a Clang thing. Clang issues
“align 2” if the struct has at least one int (2 bytes), but also if the entire
struct size is a multiple of 2, for example a struct with 4 char members. In
these cases the LLVM backend correctly creates word-sized (2-byte) loads/stores.

However, Clang issues “align 1” in the remaining cases, for example for 3-byte
structs with 3 char members. The LLVM backend just follows what Clang dictates
regarding alignment, and thus creates 2-byte or 1-byte load/store instructions
accordingly. I have not found a way to override this in LLVM. Any suggestions
are appreciated.
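
One source-level way to get 2-byte alignment and padding for an individual struct, whatever the compiler chooses by default, is an explicit alignment specifier. The following is only a sketch of that workaround (the struct name `PaddedTriple` is hypothetical); it does not change Clang's default behaviour, and whether it is appropriate depends on the ABI you need.

```cpp
#include <cstddef>

// Hypothetical 3-char struct with an explicit 2-byte alignment request.
struct alignas(2) PaddedTriple {
    char a;
    char b;
    char c;
};

// The size is rounded up to a multiple of the alignment: 3 -> 4 bytes.
static_assert(alignof(PaddedTriple) == 2, "alignment raised to 2");
static_assert(sizeof(PaddedTriple) == 4, "one trailing padding byte");
```

In C, `_Alignas(2)` on the first member or Clang's `__attribute__((aligned(2)))` on the struct has a comparable effect.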

I have also posted this problem in the cfe-dev list.

John Lluch

Tel: 620 28 45 13

Tim Northover via llvm-dev

2019-May-13 18:09 UTC

Hi Joan,

On Mon, 13 May 2019 at 18:01, Joan Lluch <joan.lluch at icloud.com> wrote:
> After looking at it a bit further, I think this is a Clang thing.  Clang
> issues “align 2” if the struct has at least one int (2 bytes), but also if the
> entire struct size is multiple of 2. For example a struct with 4 char members.
> In these cases the LLVM backend correctly creates word sized load/stores (2
> bytes).

I'm slightly surprised that it happens based purely on size, but
either way LLVM should be able to cope.

> The LLVM backend just follows what’s dictated by Clang regarding alignment
> and thus it creates 2 byte or 1 byte load/stores instructions accordingly. I
> have not found a way to override this in LLVM. Any suggestions are appreciated.

That sounds right, but I don't think it explains the shifts you
described before. It should work out a lot better than what you're
seeing. Specifically, a 3 byte struct (for example) ought to either
lower to:

    load i16, load i8 + stores if your target can do misaligned i16 operations.

or

    load i8, load i8, load i8 + stores if not.

Neither of those involves shifting operations. I'd suggest breaking
just after getMemcpyLoadsAndStores and using SelectionDAG::dump to see
exactly what it's created. Then try to work out where that gets
pessimized to shifts, because it's not normal.

Cheers.

Tim.
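
The two expected lowerings described in the message above can be written down as a small model. This is a sketch for illustration only, not LLVM code: `copyWidths`, `align` and `misalignedOk` are hypothetical names, and the model assumes a 16-bit machine where the widest access is 2 bytes.

```cpp
#include <cstddef>
#include <vector>

// Toy model only (not LLVM code): choose load/store widths in bytes
// for copying `size` bytes on a 16-bit machine.
// `align` is the known alignment of both pointers; `misalignedOk`
// says whether the target supports misaligned 16-bit accesses.
std::vector<std::size_t> copyWidths(std::size_t size, std::size_t align,
                                    bool misalignedOk) {
    const bool useWords = (align >= 2) || misalignedOk;
    std::vector<std::size_t> widths;
    while (useWords && size >= 2) {   // word-sized part
        widths.push_back(2);
        size -= 2;
    }
    while (size > 0) {                // byte-sized remainder
        widths.push_back(1);
        size -= 1;
    }
    return widths;
}
```

For a 3-byte, align-1 struct this gives a 2-byte access plus a 1-byte access when misaligned words are allowed, and three 1-byte accesses otherwise. Neither case needs shifts.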

Tim Northover via llvm-dev

2019-May-13 18:11 UTC

On Mon, 13 May 2019 at 18:01, Joan Lluch <joan.lluch at icloud.com> wrote:
> I have also posted this problem in the cfe-dev list.

Also: I'm still pretty certain that "solving" this in Clang is the
wrong approach.

Cheers.

Tim.
