Sanjay Patel via llvm-dev
2019-Feb-09 16:17 UTC
[llvm-dev] how experimental are the llvm.experimental.vector.reduce.* functions?
The IR update to allow vector types was here: https://reviews.llvm.org/D57090
...we didn't update the docs at that time because it was not clear what the
backend would do with that, but that might've changed with some of the more
recent patches.

On Sat, Feb 9, 2019 at 1:42 AM Craig Topper via llvm-dev
<llvm-dev at lists.llvm.org> wrote:

> I don't think I understand your pseudocode using
> llvm.experimental.vector.reduce.umax. All of the types you showed are
> scalar, but that intrinsic doesn't work on scalars, so I'm having a hard
> time understanding what you're trying to do with it.
> llvm.experimental.vector.reduce.umax takes a vector input and returns a
> scalar result. Do you want to find out whether any of the additions
> overflowed, or do you want a mask of which additions overflowed?
>
> The sadd.with.overflow intrinsics are in the process of gaining vector
> support, if that work isn't already complete. Simon Pilgrim made some
> commits recently. I know the documentation in the LangRef hasn't been
> updated. When vectors are used, it will return an <X x i1> vector for
> the overflow result instead of an i1.
>
> ~Craig
>
>
> On Fri, Feb 8, 2019 at 11:03 PM Andrew Kelley via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> I'm interested in using @llvm.experimental.vector.reduce.smax/umax to
>> implement runtime overflow checking for vectors. Here's an example of
>> checked addition without vectors, and then I'll follow the example with
>> what I would do for checked addition with vectors.
>>
>> Frontend code (zig):
>>
>> export fn entry() void {
>>     var a: i32 = 1;
>>     var b: i32 = 2;
>>     var x = a + b;
>> }
>>
>> LLVM IR code:
>>
>> define void @entry() #2 !dbg !41 {
>> Entry:
>>   %a = alloca i32, align 4
>>   %b = alloca i32, align 4
>>   %x = alloca i32, align 4
>>   store i32 1, i32* %a, align 4, !dbg !52
>>   call void @llvm.dbg.declare(metadata i32* %a, metadata !45, metadata !DIExpression()), !dbg !52
>>   store i32 2, i32* %b, align 4, !dbg !53
>>   call void @llvm.dbg.declare(metadata i32* %b, metadata !48, metadata !DIExpression()), !dbg !53
>>   %0 = load i32, i32* %a, align 4, !dbg !54
>>   %1 = load i32, i32* %b, align 4, !dbg !55
>>   %2 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %0, i32 %1), !dbg !56
>>   %3 = extractvalue { i32, i1 } %2, 0, !dbg !56
>>   %4 = extractvalue { i32, i1 } %2, 1, !dbg !56
>>   br i1 %4, label %OverflowFail, label %OverflowOk, !dbg !56
>>
>> OverflowFail:                                     ; preds = %Entry
>>   tail call fastcc void @panic(%"[]u8"* @2, %StackTrace* null), !dbg !56
>>   unreachable, !dbg !56
>>
>> OverflowOk:                                       ; preds = %Entry
>>   store i32 %3, i32* %x, align 4, !dbg !57
>>   call void @llvm.dbg.declare(metadata i32* %x, metadata !50, metadata !DIExpression()), !dbg !57
>>   ret void, !dbg !58
>> }
>>
>> You can see this takes advantage of @llvm.sadd.with.overflow, which is
>> not available for vectors. So here is a different approach (pseudocode):
>>
>> %a_zext = zext %a to i33    # 1 more bit
>> %b_zext = zext %b to i33    # 1 more bit
>> %result_zext = add %a_zext, %b_zext
>> %max_result = @llvm.experimental.vector.reduce.umax(%result_zext)
>> %overflow = icmp %max_result > @max_i32_value
>> %result = trunc %result_zext to i32
>>
>> You can imagine how this would work for signed integers, replacing zext
>> with sext and umax with smax.
>>
>> This depends on an "experimental" API. Can anyone advise on depending on
>> this API? Is it a bad idea? Is it about to be promoted to
>> non-experimental soon? Can anyone advise on how best to achieve my goal?
>>
>> Kind regards,
>> Andrew
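
To make the quoted pseudocode concrete for real vector types, here is a
minimal sketch of the unsigned widened-add check over a <4 x i32> input.
The .v4i33 suffix assumes the LLVM 8-era experimental mangling, so treat
the exact intrinsic name as an assumption rather than documented behaviour:

declare i33 @llvm.experimental.vector.reduce.umax.v4i33(<4 x i33>)

  ; Widen each lane by one bit so the lane-wise add cannot wrap.
  %a_zext = zext <4 x i32> %a to <4 x i33>
  %b_zext = zext <4 x i32> %b to <4 x i33>
  %sum    = add <4 x i33> %a_zext, %b_zext
  ; Reduce to the largest lane value and compare it against the i32 maximum.
  %max      = call i33 @llvm.experimental.vector.reduce.umax.v4i33(<4 x i33> %sum)
  %overflow = icmp ugt i33 %max, 4294967295
  %result   = trunc <4 x i33> %sum to <4 x i32>
  br i1 %overflow, label %OverflowFail, label %OverflowOk

A signed variant would sext to <4 x i33> and use reduce.smax, and would also
need a matching reduce.smin check against the signed i32 minimum to catch
negative overflow.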
Simon Pilgrim via llvm-dev
2019-Feb-09 17:25 UTC
[llvm-dev] how experimental are the llvm.experimental.vector.reduce.* functions?
The add/sub (+mul) overflow intrinsics are being updated to support vectors,
to match the related add/sub saturation intrinsics. We haven't updated the
docs yet as legalization, vectorization and various minor bits of plumbing
still need to be finished before they can be officially supported (Nikita
Popov has been looking at the legalization recently).

Regarding the reduction functions - I think the integer intrinsics at least
are relatively stable, and we can probably investigate dropping the
experimental tag before the next release (assuming someone has the time to
take on the work) - it'd be nice to have the SLP vectorizer emit reduction
intrinsics directly for these.

The floating point intrinsics are trickier as they (may) have stricter
ordering constraints that are still causing issues and may need tweaking
(e.g. see PR36734).
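
Once that plumbing lands, the checked addition from the original example
could be written directly on the whole vector. The following sketch assumes
the vector overload and mangling (.v4i32) that the in-progress patches
appear to use; the exact signature is an assumption, not documented
behaviour:

declare { <4 x i32>, <4 x i1> } @llvm.sadd.with.overflow.v4i32(<4 x i32>, <4 x i32>)
declare i1 @llvm.experimental.vector.reduce.or.v4i1(<4 x i1>)

  ; Per-lane checked add: the second struct member is a <4 x i1> overflow mask.
  %res     = call { <4 x i32>, <4 x i1> } @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
  %sum     = extractvalue { <4 x i32>, <4 x i1> } %res, 0
  %ov_mask = extractvalue { <4 x i32>, <4 x i1> } %res, 1
  ; Collapse the mask to a single "did any lane overflow?" bit.
  %any_ov  = call i1 @llvm.experimental.vector.reduce.or.v4i1(<4 x i1> %ov_mask)
  br i1 %any_ov, label %OverflowFail, label %OverflowOk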
Nikita Popov via llvm-dev
2019-Feb-09 17:37 UTC
[llvm-dev] how experimental are the llvm.experimental.vector.reduce.* functions?
On Sat, Feb 9, 2019 at 6:25 PM Simon Pilgrim <llvm-dev at redking.me.uk> wrote:

> Regarding the reduction functions - I think the integer intrinsics at least
> are relatively stable and we can probably investigate dropping the
> experimental tag before the next release (assuming someone has the time to
> take on the work) - it'd be nice to have the SLP vectorizer emit reduction
> intrinsics directly for these.

The vector reduction intrinsics still need quite a lot of work. Apart from
SplitVecOp, all legalizations are currently missing. This is only noticeable
on AArch64 right now, because all other targets expand vector reductions
prior to codegen.

Nikita
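
The expansion Nikita mentions is, roughly, a log2-depth tree of shuffles and
per-lane maxes that targets other than AArch64 see in place of the reduction
intrinsic. An illustrative sketch for a <4 x i32> umax reduction (the shuffle
masks are illustrative, not the exact output of any particular pass):

  ; Halve the number of live lanes each step:
  ; max(v0..v3) == max(max(v0, v2), max(v1, v3)).
  %rdx1 = shufflevector <4 x i32> %v, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
  %cmp1 = icmp ugt <4 x i32> %v, %rdx1
  %max1 = select <4 x i1> %cmp1, <4 x i32> %v, <4 x i32> %rdx1
  %rdx2 = shufflevector <4 x i32> %max1, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %cmp2 = icmp ugt <4 x i32> %max1, %rdx2
  %max2 = select <4 x i1> %cmp2, <4 x i32> %max1, <4 x i32> %rdx2
  %res  = extractelement <4 x i32> %max2, i32 0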