Andrew Kelley via llvm-dev
2019-Feb-09 07:02 UTC
[llvm-dev] how experimental are the llvm.experimental.vector.reduce.* functions?
I'm interested in using @llvm.experimental.vector.reduce.smax/umax to implement runtime overflow checking for vectors. Here's an example checked addition, without vectors, and then I'll follow the example with what I would do for checked addition with vectors.

Frontend code (zig):

    export fn entry() void {
        var a: i32 = 1;
        var b: i32 = 2;
        var x = a + b;
    }

LLVM IR code:

    define void @entry() #2 !dbg !41 {
    Entry:
      %a = alloca i32, align 4
      %b = alloca i32, align 4
      %x = alloca i32, align 4
      store i32 1, i32* %a, align 4, !dbg !52
      call void @llvm.dbg.declare(metadata i32* %a, metadata !45, metadata !DIExpression()), !dbg !52
      store i32 2, i32* %b, align 4, !dbg !53
      call void @llvm.dbg.declare(metadata i32* %b, metadata !48, metadata !DIExpression()), !dbg !53
      %0 = load i32, i32* %a, align 4, !dbg !54
      %1 = load i32, i32* %b, align 4, !dbg !55
      %2 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %0, i32 %1), !dbg !56
      %3 = extractvalue { i32, i1 } %2, 0, !dbg !56
      %4 = extractvalue { i32, i1 } %2, 1, !dbg !56
      br i1 %4, label %OverflowFail, label %OverflowOk, !dbg !56

    OverflowFail:                                     ; preds = %Entry
      tail call fastcc void @panic(%"[]u8"* @2, %StackTrace* null), !dbg !56
      unreachable, !dbg !56

    OverflowOk:                                       ; preds = %Entry
      store i32 %3, i32* %x, align 4, !dbg !57
      call void @llvm.dbg.declare(metadata i32* %x, metadata !50, metadata !DIExpression()), !dbg !57
      ret void, !dbg !58
    }

You can see this takes advantage of @llvm.sadd.with.overflow, which is not available with vectors. So here is a different approach (pseudocode):

    %a_zext = zext %a to i33 # 1 more bit
    %b_zext = zext %b to i33 # 1 more bit
    %result_zext = add %a_zext, %b_zext
    %max_result = @llvm.experimental.vector.reduce.umax(%result_zext)
    %overflow = icmp %max_result > @max_i32_value
    %result = trunc %result_zext to i32

You can imagine how this would work for signed integers, replacing zext with sext and umax with smax.

This depends on an "experimental" API. Can anyone advise on depending on this API? Is it a bad idea? Is it about to be promoted to non-experimental soon? Can anyone advise on how to best achieve my goal?

Kind regards,
Andrew
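The widening approach in the pseudocode above can be modeled outside of LLVM to check the arithmetic. The following is a minimal C++ sketch, not LLVM IR and not an LLVM API: `CheckedAdd` and `add_u32_lanes_checked` are hypothetical names, `std::vector` stands in for an LLVM vector type, and `uint64_t` stands in for `i33` because C++ has no 33-bit integer. It covers only the unsigned (zext/umax) case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical result type: the truncated per-lane sums plus a single flag
// that is true when at least one lane overflowed 32 bits.
struct CheckedAdd {
    std::vector<std::uint32_t> lanes;
    bool overflow;
};

// Hypothetical helper modeling: zext, add, reduce.umax, icmp, trunc.
CheckedAdd add_u32_lanes_checked(const std::vector<std::uint32_t>& a,
                                 const std::vector<std::uint32_t>& b) {
    assert(a.size() == b.size() && "lane counts must match");

    // zext both operands and add in the wider type, so the add cannot wrap.
    std::vector<std::uint64_t> wide(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        wide[i] = static_cast<std::uint64_t>(a[i]) +
                  static_cast<std::uint64_t>(b[i]);
    }

    // reduce.umax: the largest widened sum across all lanes (0 if empty).
    std::uint64_t max_sum = 0;
    for (std::uint64_t w : wide) {
        max_sum = std::max(max_sum, w);
    }

    CheckedAdd out;
    // icmp: if the largest sum fits in 32 bits, every sum does.
    out.overflow = max_sum > std::numeric_limits<std::uint32_t>::max();

    // trunc: keep the low 32 bits of each lane.
    out.lanes.reserve(wide.size());
    for (std::uint64_t w : wide) {
        out.lanes.push_back(static_cast<std::uint32_t>(w));
    }
    return out;
}
```

A single max-reduction is enough here because unsigned sums can only be too large. For the signed variant, a max-reduction alone would only catch sums above the i32 maximum; sums below the i32 minimum would also need a matching min-reduction.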
Craig Topper via llvm-dev
2019-Feb-09 08:42 UTC
I don't think I understand your pseudocode using llvm.experimental.vector.reduce.umax. All of the types you showed are scalar, but that intrinsic doesn't work on scalars, so I'm having a hard time understanding what you're trying to do with it. llvm.experimental.vector.reduce.umax takes a vector input and returns a scalar result. Do you want to find out whether any of the additions overflowed, or do you want a mask of which additions overflowed?

The sadd.with.overflow intrinsics are in the process of gaining vector support, if that work is not already complete. Simon Pilgrim made some commits recently. I know the documentation in the LangRef hasn't been updated. It will return a <X x i1> vector for overflow instead of an i1 when vectors are used.

~Craig
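The difference between "did any addition overflow" and "which additions overflowed" can be made concrete with a small scalar model. This is a C++ sketch under stated assumptions, not LLVM IR: `LaneAdd`, `sadd_with_overflow_lanes`, and `any_overflow` are hypothetical names, `std::vector<bool>` stands in for the `<X x i1>` overflow vector, and each lane is checked by doing the add in 64 bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical result type mirroring a { sums, overflow-mask } pair.
struct LaneAdd {
    std::vector<std::int32_t> sums;  // wrapped 32-bit result per lane
    std::vector<bool> overflowed;    // one flag per lane
};

// Hypothetical helper: signed add with a per-lane overflow flag.
LaneAdd sadd_with_overflow_lanes(const std::vector<std::int32_t>& a,
                                 const std::vector<std::int32_t>& b) {
    assert(a.size() == b.size() && "lane counts must match");
    constexpr std::int64_t kMax = std::numeric_limits<std::int32_t>::max();
    constexpr std::int64_t kMin = std::numeric_limits<std::int32_t>::min();

    LaneAdd out;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // The 64-bit sum of two 32-bit values cannot overflow.
        const std::int64_t wide = static_cast<std::int64_t>(a[i]) + b[i];
        // Signed overflow can happen in either direction.
        out.overflowed.push_back(wide > kMax || wide < kMin);
        // Keep the low 32 bits. The final conversion is modular since C++20
        // and wraps on mainstream compilers before that.
        out.sums.push_back(
            static_cast<std::int32_t>(static_cast<std::uint32_t>(wide)));
    }
    return out;
}

// Collapses the per-lane mask into a single "any lane overflowed" answer.
bool any_overflow(const LaneAdd& r) {
    return std::any_of(r.overflowed.begin(), r.overflowed.end(),
                       [](bool lane) { return lane; });
}
```

Keeping the per-lane mask and reducing it only when a single answer is needed serves both use cases: a panic-on-overflow check needs just `any_overflow`, while a caller that wants to report the offending lane can inspect `overflowed` directly.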
Sanjay Patel via llvm-dev
2019-Feb-09 16:17 UTC
The IR update to allow vector types with the overflow intrinsics was here:
https://reviews.llvm.org/D57090

We didn't update the docs at that time because it was not clear what the backend would do with that, but that might have changed with some of the more recent patches.