Gleb Popov via llvm-dev
2020-Jan-03 18:00 UTC
[llvm-dev] How do I teach codegen to handle i8 arrays as i32s?
Hello.

My backend supports only i32 stores. Given the following IR:

  %2 = alloca [2 x i8], align 1
  %4 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 0
  store i8 1, i8* %4, align 1
  %5 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 1
  store i8 2, i8* %5, align 1

is it possible to convince LLVM codegen to first load 4 bytes, then blend it with the value being stored using "or" and then store 4 bytes back?

Or maybe it should be performed on the IR level?

Thanks in advance.
Matt Arsenault via llvm-dev
2020-Jan-03 18:42 UTC
[llvm-dev] How do I teach codegen to handle i8 arrays as i32s?
> On Jan 3, 2020, at 13:00, Gleb Popov via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hello.
>
> My backend supports only i32 stores. Given the following IR:
>
> %2 = alloca [2 x i8], align 1
>
> %4 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 0
> store i8 1, i8* %4, align 1
> %5 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 1
> store i8 2, i8* %5, align 1
>
> is it possible to convince LLVM codegen to first load 4 bytes, then blend it with the value being stored using "or" and then store 4 bytes back?

Yes, you can accomplish this with custom lowering for the store.

-Matt
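For illustration, the load/blend/store sequence discussed above can be modelled in plain C++. This is only a sketch of the arithmetic a custom store lowering would have to emit, not LLVM backend code; the helper name storeByteViaWord is hypothetical, and it assumes a little-endian target and word-aligned memory. Note that a bare "or" is not sufficient on its own: the destination byte lane has to be cleared with an "and" mask first, otherwise stale bits from the old value survive.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): emulate `store i8 value` at byte
// address `byteAddr` using only 32-bit loads and stores.
// Assumptions: little-endian byte order, `words` is 4-byte aligned memory.
inline void storeByteViaWord(uint32_t *words, uint32_t byteAddr, uint8_t value) {
    const uint32_t wordIndex = byteAddr >> 2;          // containing 32-bit word
    const uint32_t shift = (byteAddr & 3u) * 8u;       // bit offset of the byte lane
    const uint32_t mask = 0xFFu << shift;              // selects the lane to replace

    const uint32_t old = words[wordIndex];             // 32-bit load
    const uint32_t merged =
        (old & ~mask) | (static_cast<uint32_t>(value) << shift); // clear lane, then "or"
    words[wordIndex] = merged;                         // 32-bit store
}
```

Applied to the two stores from the original IR, calling storeByteViaWord(mem, 0, 1) and then storeByteViaWord(mem, 1, 2) rewrites only the two low bytes of the first word and leaves the upper two bytes as they were.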
Doerfert, Johannes via llvm-dev
2020-Jan-03 18:48 UTC
[llvm-dev] How do I teach codegen to handle i8 arrays as i32s?
On 01/03, Gleb Popov via llvm-dev wrote:
> My backend supports only i32 stores. Given the following IR:
>
> %2 = alloca [2 x i8], align 1
>
> %4 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 0
> store i8 1, i8* %4, align 1
> %5 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 1
> store i8 2, i8* %5, align 1
>
> is it possible to convince LLVM codegen to first load 4 bytes, then blend
> it with the value being stored using "or" and then store 4 bytes back?
>
> Or maybe it should be performed on the IR level?

In the example as shown, that is on its own not necessarily legal (on the IR level). If you expand the allocation to 4 x i8 it should be allowed, or if you do it late enough (i.e., in the backend) you can make it work without introducing UB.

I personally would do it on IR, seems easy enough. I think we have similar rules in InstCombine already.

Cheers,
  Johannes
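To make the legality concern concrete: a widened 4-byte access touches every byte of the containing word, so it stays inside the allocation only if the allocation spans that whole word. The following sketch checks this with simple arithmetic. The names kWordBytes, padToWord and widenedAccessInBounds are hypothetical, and the code assumes the allocation's base address is 4-byte aligned (the alloca in the example is only align 1, so its alignment would need to be raised as well).

```cpp
#include <cstdint>

// Width in bytes of the only store the target supports (assumed: i32).
constexpr uint64_t kWordBytes = 4;

// Round an allocation size up to a whole number of words,
// e.g. [2 x i8] would become [4 x i8].
constexpr uint64_t padToWord(uint64_t sizeInBytes) {
    return (sizeInBytes + kWordBytes - 1) & ~(kWordBytes - 1);
}

// True if the word containing `byteOffset` lies entirely inside an
// allocation of `allocSize` bytes (base assumed word-aligned).
constexpr bool widenedAccessInBounds(uint64_t allocSize, uint64_t byteOffset) {
    const uint64_t wordStart = byteOffset & ~(kWordBytes - 1);
    return wordStart + kWordBytes <= allocSize;
}
```

With the 2-byte allocation from the example, widenedAccessInBounds(2, 0) is false because the word would cover bytes 2 and 3, which lie past the end of the object; after padding with padToWord(2), which yields 4, the same access is in bounds.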