Mehdi AMINI via llvm-dev
2017-Jun-19 16:34 UTC
[llvm-dev] LLVM behavior different depending on function symbol name
Using `opt --print-after-all -O3` I see that EarlyCSE is interpreting the call to `ceil` and constant-folding it:

*** IR Dump After Early CSE ***
; Function Attrs: nobuiltin nounwind
define i1 @do_test() #2 {
Entry:
  %0 = call fastcc float @ceil(float 0.000000e+00) #6
  %1 = call fastcc float @ceil32(float 0.000000e+00) #6
  %2 = fcmp fast oeq float 0.000000e+00, %1
  ret i1 %2
}

So just running `opt -early-cse -debug` seems enough:

EarlyCSE Simplify: %0 = call fastcc float @ceil(float 0.000000e+00) #6 to: float 0.000000e+00

I suspect it is not correct for EarlyCSE to do that.

--
Mehdi

2017-06-19 9:27 GMT-07:00 Andrew Kelley <superjoe30 at gmail.com>:

> Oops. Pressed send on accident.
>
> With -O3, the module gets rewritten to:
> ; Function Attrs: nobuiltin nounwind
> define i1 @do_test() local_unnamed_addr #0 !dbg !16 {
> Entry:
>   %x.sroa.0.i.i = alloca i32, align 4
>   %x.sroa.0.i.i.i = alloca i32, align 4
>   tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !21, metadata !28) #3, !dbg !29
>   tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !32, metadata !28) #3, !dbg !44
>   tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !35, metadata !28) #3, !dbg !49
>   tail call void @llvm.dbg.value(metadata i32 -127, i64 0, metadata !39, metadata !28) #3, !dbg !50
>   %x.sroa.0.i.i.i.0.sroa_cast = bitcast i32* %x.sroa.0.i.i.i to i8*, !dbg !51
>   call void @llvm.lifetime.start(i64 4, i8* nonnull %x.sroa.0.i.i.i.0.sroa_cast), !dbg !51
>   tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !57, metadata !28) #3, !dbg !51
>   tail call void @llvm.dbg.value(metadata float* undef, i64 0, metadata !61, metadata !28) #3, !dbg !67
>   %x.sroa.0.i.i.i.0.x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i.i = load i32, i32* %x.sroa.0.i.i.i, align 4, !dbg !68
>   store volatile i32 %x.sroa.0.i.i.i.0.x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i.i, i32* %x.sroa.0.i.i.i, align 4, !dbg !70
>   call void @llvm.lifetime.end(i64 4, i8* nonnull %x.sroa.0.i.i.i.0.sroa_cast), !dbg !71
>   tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !32, metadata !28) #3, !dbg !72
>   tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !35, metadata !28) #3, !dbg !75
>   tail call void @llvm.dbg.value(metadata i32 -127, i64 0, metadata !39, metadata !28) #3, !dbg !76
>   %x.sroa.0.i.i.0.sroa_cast = bitcast i32* %x.sroa.0.i.i to i8*, !dbg !77
>   call void @llvm.lifetime.start(i64 4, i8* nonnull %x.sroa.0.i.i.0.sroa_cast), !dbg !77
>   tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !57, metadata !28) #3, !dbg !77
>   tail call void @llvm.dbg.value(metadata float* undef, i64 0, metadata !61, metadata !28) #3, !dbg !79
>   %x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i = load i32, i32* %x.sroa.0.i.i, align 4, !dbg !80
>   store volatile i32 %x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i, i32* %x.sroa.0.i.i, align 4, !dbg !81
>   call void @llvm.lifetime.end(i64 4, i8* nonnull %x.sroa.0.i.i.0.sroa_cast), !dbg !82
>   ret i1 false, !dbg !83
> }
>
> Note the `ret i1 false` at the end. Expected it to return true.
>
> On Mon, Jun 19, 2017 at 12:26 PM, Andrew Kelley <superjoe30 at gmail.com> wrote:
>
>> On Mon, Jun 19, 2017 at 12:06 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> 2017-06-19 8:45 GMT-07:00 Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org>:
>>>
>>>> Greetings,
>>>>
>>>> I have a Zig implementation of ceil which is emitted into LLVM IR like this:
>>>>
>>>> ; Function Attrs: nobuiltin nounwind
>>>> define internal fastcc float @ceil(float) unnamed_addr #3 !dbg !644 {
>>>> Entry:
>>>>   %x = alloca float, align 4
>>>>   store float %0, float* %x
>>>>   call void @llvm.dbg.declare(metadata float* %x, metadata !649, metadata !494), !dbg !651
>>>>   %1 = load float, float* %x, !dbg !652
>>>>   %2 = call fastcc float @ceil32(float %1) #8, !dbg !656
>>>>   ret float %2, !dbg !657
>>>> }
>>>>
>>>> Test case:
>>>>
>>>> test "math.ceil" {
>>>>     assert(ceil(f32(0.0)) == ceil32(0.0));
>>>>     assert(ceil(f64(0.0)) == ceil64(0.0));
>>>> }
>>>>
>>>> When I compile with optimizations on, this test case fails. The
>>>> optimized code for the test case ends up being a call to panic (assertion
>>>> failure), which means that LLVM determined the test failed at compile-time.
>>>>
>>>> What's strange about this is that if I change the function name from
>>>> @ceil to @ceil_asdf (and change the callers) then the test passes.
>>>>
>>>> So I think LLVM is doing some kind of string comparison on the symbol
>>>> name and detecting that it is "ceil" and then having different, undesired
>>>> behavior.
>>>>
>>>> I tried putting `nobuiltin` in the function attributes and at the
>>>> callsite, but that did not change anything.
>>>>
>>>> Any ideas what's going on?
>>>
>>> I think it'd be a lot easier to figure if you provide a standalone repro.
>>> >> >> Standalone repro: >> >> ; ModuleID = 'test' >> source_filename = "test" >> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" >> target triple = "x86_64-unknown-linux-gnu" >> >> %"[]u8" = type { i8*, i64 } >> >> @__zig_panic_implementation_provided = internal unnamed_addr constant i1 >> true, align 1 >> >> ; Function Attrs: nounwind >> declare void @llvm.debugtrap() #0 >> >> ; Function Attrs: argmemonly nounwind >> declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* >> nocapture readonly, i64, i32, i1) #1 >> >> ; Function Attrs: argmemonly nounwind >> declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i32, >> i1) #1 >> >> ; Function Attrs: nobuiltin nounwind >> define i1 @do_test() #2 !dbg !16 { >> Entry: >> %0 = call fastcc float @ceil(float 0.000000e+00) #6, !dbg !21 >> %1 = call fastcc float @ceil32(float 0.000000e+00) #6, !dbg !23 >> %2 = fcmp fast oeq float %0, %1, !dbg !24 >> ret i1 %2, !dbg !25 >> } >> >> ; Function Attrs: cold nobuiltin noreturn nounwind >> define linkonce coldcc void @__zig_panic(i8* nonnull readonly, i64) #3 >> !dbg !26 { >> Entry: >> %2 = alloca %"[]u8", align 8 >> %message_ptr = alloca i8*, align 8 >> %message_len = alloca i64, align 8 >> store i8* %0, i8** %message_ptr >> call void @llvm.dbg.declare(metadata i8** %message_ptr, metadata !34, >> metadata !37), !dbg !38 >> store i64 %1, i64* %message_len >> call void @llvm.dbg.declare(metadata i64* %message_len, metadata !35, >> metadata !37), !dbg !39 >> %3 = load i64, i64* %message_len, !dbg !40 >> %4 = load i8*, i8** %message_ptr, !dbg !44 >> %5 = getelementptr inbounds %"[]u8", %"[]u8"* %2, i32 0, i32 0, !dbg !44 >> %6 = getelementptr inbounds i8, i8* %4, i64 0, !dbg !44 >> store i8* %6, i8** %5, !dbg !44 >> %7 = getelementptr inbounds %"[]u8", %"[]u8"* %2, i32 0, i32 1, !dbg !44 >> %8 = sub nsw i64 %3, 0, !dbg !44 >> store i64 %8, i64* %7, !dbg !44 >> call fastcc void @panic(%"[]u8"* byval %2) #6, !dbg !45 >> unreachable, !dbg !45 
>> } >> >> ; Function Attrs: nobuiltin nounwind >> define internal fastcc float @ceil(float) unnamed_addr #2 !dbg !46 { >> Entry: >> %x = alloca float, align 4 >> store float %0, float* %x >> call void @llvm.dbg.declare(metadata float* %x, metadata !51, metadata >> !37), !dbg !53 >> %1 = load float, float* %x, !dbg !54 >> %2 = call fastcc float @ceil32(float %1) #7, !dbg !58 >> ret float %2, !dbg !59 >> } >> >> ; Function Attrs: nobuiltin nounwind >> define internal fastcc float @ceil32(float) unnamed_addr #2 !dbg !60 { >> Entry: >> %x = alloca float, align 4 >> %u = alloca i32, align 4 >> %e = alloca i32, align 4 >> %m = alloca i32, align 4 >> store float %0, float* %x >> call void @llvm.dbg.declare(metadata float* %x, metadata !62, metadata >> !37), !dbg !72 >> %1 = load float, float* %x, !dbg !73 >> %2 = bitcast float %1 to i32, !dbg !74 >> store i32 %2, i32* %u, !dbg !75 >> call void @llvm.dbg.declare(metadata i32* %u, metadata !63, metadata >> !37), !dbg !75 >> %3 = load i32, i32* %u, !dbg !76 >> %4 = lshr i32 %3, 23, !dbg !77 >> %5 = and i32 %4, 255, !dbg !78 >> %6 = sub nsw i32 %5, 127, !dbg !79 >> store i32 %6, i32* %e, !dbg !80 >> call void @llvm.dbg.declare(metadata i32* %e, metadata !67, metadata >> !37), !dbg !80 >> call void @llvm.dbg.declare(metadata i32* %m, metadata !70, metadata >> !37), !dbg !81 >> %7 = load i32, i32* %e, !dbg !82 >> %8 = icmp sge i32 %7, 23, !dbg !84 >> br i1 %8, label %Then, label %Else, !dbg !84 >> >> Then: ; preds = %Entry >> %9 = load float, float* %x, !dbg !85 >> ret float %9, !dbg !87 >> >> Else: ; preds = %Entry >> %10 = load i32, i32* %e, !dbg !88 >> %11 = icmp sge i32 %10, 0, !dbg !89 >> br i1 %11, label %Then1, label %Else2, !dbg !89 >> >> Then1: ; preds = %Else >> %12 = load i32, i32* %e, !dbg !90 >> %13 = lshr i32 8388607, %12, !dbg !92 >> store i32 %13, i32* %m, !dbg !93 >> %14 = load i32, i32* %u, !dbg !94 >> %15 = load i32, i32* %m, !dbg !95 >> %16 = and i32 %14, %15, !dbg !96 >> %17 = icmp eq i32 %16, 0, !dbg !97 
>> br i1 %17, label %Then3, label %Else4, !dbg !97 >> >> Else2: ; preds = %Else >> %18 = load float, float* %x, !dbg !98 >> %19 = fadd fast float %18, 0x4770000000000000, !dbg !100 >> call fastcc void @forceEval(float %19) #6, !dbg !101 >> %20 = load i32, i32* %u, !dbg !102 >> %21 = lshr i32 %20, 31, !dbg !103 >> %22 = icmp ne i32 %21, 0, !dbg !104 >> br i1 %22, label %Then5, label %Else6, !dbg !104 >> >> Then3: ; preds = %Then1 >> %23 = load float, float* %x, !dbg !105 >> ret float %23, !dbg !107 >> >> Else4: ; preds = %Then1 >> br label %EndIf, !dbg !108 >> >> Then5: ; preds = %Else2 >> ret float -0.000000e+00, !dbg !109 >> >> Else6: ; preds = %Else2 >> br label %EndIf7, !dbg !111 >> >> EndIf: ; preds = %Else4 >> %24 = load float, float* %x, !dbg !112 >> %25 = fadd fast float %24, 0x4770000000000000, !dbg !113 >> call fastcc void @forceEval(float %25) #6, !dbg !114 >> %26 = load i32, i32* %u, !dbg !115 >> %27 = lshr i32 %26, 31, !dbg !116 >> %28 = icmp eq i32 %27, 0, !dbg !117 >> br i1 %28, label %Then8, label %Else9, !dbg !117 >> >> EndIf7: ; preds = %Else6 >> br label %EndIf11, !dbg !118 >> >> Then8: ; preds = %EndIf >> %29 = load i32, i32* %u, !dbg !119 >> %30 = load i32, i32* %m, !dbg !121 >> %31 = add nuw i32 %29, %30, !dbg !122 >> store i32 %31, i32* %u, !dbg !122 >> br label %EndIf10, !dbg !123 >> >> Else9: ; preds = %EndIf >> br label %EndIf10, !dbg !123 >> >> EndIf10: ; preds = %Else9, %Then8 >> %32 = load i32, i32* %u, !dbg !124 >> %33 = load i32, i32* %m, !dbg !125 >> %34 = xor i32 %33, -1, !dbg !126 >> %35 = and i32 %32, %34, !dbg !127 >> store i32 %35, i32* %u, !dbg !127 >> %36 = load i32, i32* %u, !dbg !128 >> %37 = bitcast i32 %36 to float, !dbg !129 >> br label %EndIf11, !dbg !118 >> >> EndIf11: ; preds = %EndIf10, >> %EndIf7 >> %38 = phi float [ %37, %EndIf10 ], [ 1.000000e+00, %EndIf7 ], !dbg !118 >> ret float %38, !dbg !130 >> } >> >> ; Function Attrs: nobuiltin noreturn nounwind >> define internal fastcc void @panic(%"[]u8"* byval nonnull 
readonly) >> unnamed_addr #4 !dbg !131 { >> Entry: >> call void @llvm.dbg.declare(metadata %"[]u8"* %0, metadata !141, >> metadata !37), !dbg !142 >> call void @llvm.debugtrap(), !dbg !143 >> br label %WhileCond, !dbg !146 >> >> WhileCond: ; preds = %WhileCond, >> %Entry >> br label %WhileCond, !dbg !146 >> } >> >> ; Function Attrs: nobuiltin nounwind >> define internal fastcc void @forceEval(float) unnamed_addr #2 !dbg !147 { >> Entry: >> %value = alloca float, align 4 >> %x = alloca float, align 4 >> %p = alloca float*, align 8 >> store float %0, float* %value >> call void @llvm.dbg.declare(metadata float* %value, metadata !151, >> metadata !37), !dbg !158 >> call void @llvm.dbg.declare(metadata float* %x, metadata !152, metadata >> !37), !dbg !159 >> store float* %x, float** %p, !dbg !160 >> call void @llvm.dbg.declare(metadata float** %p, metadata !155, >> metadata !37), !dbg !160 >> %1 = load float*, float** %p, !dbg !161 >> %2 = load float, float* %x, !dbg !163 >> store volatile float %2, float* %1, !dbg !164 >> ret void, !dbg !165 >> } >> >> ; Function Attrs: nounwind readnone >> declare void @llvm.dbg.declare(metadata, metadata, metadata) #5 >> >> attributes #0 = { nounwind } >> attributes #1 = { argmemonly nounwind } >> attributes #2 = { nobuiltin nounwind } >> attributes #3 = { cold nobuiltin noreturn nounwind } >> attributes #4 = { nobuiltin noreturn nounwind } >> attributes #5 = { nounwind readnone } >> attributes #6 = { nobuiltin } >> attributes #7 = { alwaysinline nobuiltin } >> >> !llvm.module.flags = !{!0} >> !llvm.dbg.cu = !{!1} >> >> !0 = !{i32 2, !"Debug Info Version", i32 3} >> !1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: >> "zig 0.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, >> enums: !3, globals: !12) >> !2 = !DIFile(filename: "test", directory: ".") >> !3 = !{!4} >> !4 = !DICompositeType(tag: DW_TAG_enumeration_type, name: >> "GlobalLinkage", scope: !5, file: !5, line: 126, baseType: !6, 
size: 8, >> align: 8, elements: !7) >> !5 = !DIFile(filename: "builtin.zig", directory: >> "/home/andy/dev/zig/build/zig-cache") >> !6 = !DIBasicType(name: "u8", size: 8, encoding: DW_ATE_unsigned_char) >> !7 = !{!8, !9, !10, !11} >> !8 = !DIEnumerator(name: "Internal", value: 0) >> !9 = !DIEnumerator(name: "Strong", value: 1) >> !10 = !DIEnumerator(name: "Weak", value: 2) >> !11 = !DIEnumerator(name: "LinkOnce", value: 3) >> !12 = !{!13} >> !13 = !DIGlobalVariableExpression(var: !14) >> !14 = distinct !DIGlobalVariable(name: "__zig_panic_implementation_provided", >> linkageName: "__zig_panic_implementation_provided", scope: !5, file: !5, >> line: 189, type: !15, isLocal: true, isDefinition: true) >> !15 = !DIBasicType(name: "bool", size: 8, encoding: DW_ATE_boolean) >> !16 = distinct !DISubprogram(name: "do_test", scope: !17, file: !17, >> line: 46, type: !18, isLocal: false, isDefinition: true, scopeLine: 46, >> isOptimized: true, unit: !1, variables: !20) >> !17 = !DIFile(filename: "test.zig", directory: "/home/andy/dev/zig/build") >> !18 = !DISubroutineType(types: !19) >> !19 = !{!15} >> !20 = !{} >> !21 = !DILocation(line: 47, column: 16, scope: !22) >> !22 = distinct !DILexicalBlock(scope: !16, file: !17, line: 46, column: >> 29) >> !23 = !DILocation(line: 47, column: 36, scope: !22) >> !24 = !DILocation(line: 47, column: 27, scope: !22) >> !25 = !DILocation(line: 47, column: 5, scope: !22) >> !26 = distinct !DISubprogram(name: "__zig_panic", scope: !27, file: !27, >> line: 7, type: !28, isLocal: false, isDefinition: true, scopeLine: 7, >> isOptimized: true, unit: !1, variables: !33) >> !27 = !DIFile(filename: "zigrt.zig", directory: >> "/home/andy/dev/zig/build/lib/zig/std/special") >> !28 = !DISubroutineType(types: !29) >> !29 = !{!30, !31, !32} >> !30 = !DIBasicType(name: "void", encoding: DW_ATE_unsigned) >> !31 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&const u8", >> baseType: !6, size: 64, align: 64) >> !32 = !DIBasicType(name: "usize", size: 
64, encoding: DW_ATE_unsigned) >> !33 = !{!34, !35} >> !34 = !DILocalVariable(name: "message_ptr", arg: 1, scope: !26, file: >> !27, line: 7, type: !31) >> !35 = !DILocalVariable(name: "message_len", arg: 2, scope: !36, file: >> !27, line: 7, type: !32) >> !36 = distinct !DILexicalBlock(scope: !26, file: !27, line: 7, column: 30) >> !37 = !DIExpression() >> !38 = !DILocation(line: 7, column: 30, scope: !26) >> !39 = !DILocation(line: 7, column: 54, scope: !36) >> !40 = !DILocation(line: 12, column: 48, scope: !41) >> !41 = distinct !DILexicalBlock(scope: !42, file: !27, line: 11, column: >> 54) >> !42 = distinct !DILexicalBlock(scope: !43, file: !27, line: 7, column: 86) >> !43 = distinct !DILexicalBlock(scope: !36, file: !27, line: 7, column: 54) >> !44 = !DILocation(line: 12, column: 43, scope: !41) >> !45 = !DILocation(line: 12, column: 31, scope: !41) >> !46 = distinct !DISubprogram(name: "ceil", scope: !17, file: !17, line: >> 3, type: !47, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: >> true, unit: !1, variables: !50) >> !47 = !DISubroutineType(types: !48) >> !48 = !{!49, !49} >> !49 = !DIBasicType(name: "f32", size: 32, encoding: DW_ATE_float) >> !50 = !{!51} >> !51 = !DILocalVariable(name: "x", arg: 1, scope: !52, file: !17, line: 3, >> type: !49) >> !52 = distinct !DILexicalBlock(scope: !46, file: !17, line: 3, column: 13) >> !53 = !DILocation(line: 3, column: 13, scope: !52) >> !54 = !DILocation(line: 6, column: 36, scope: !55) >> !55 = distinct !DILexicalBlock(scope: !56, file: !17, line: 4, column: 5) >> !56 = distinct !DILexicalBlock(scope: !57, file: !17, line: 3, column: 35) >> !57 = distinct !DILexicalBlock(scope: !52, file: !17, line: 3, column: 13) >> !58 = !DILocation(line: 6, column: 16, scope: !55) >> !59 = !DILocation(line: 5, column: 5, scope: !57) >> !60 = distinct !DISubprogram(name: "ceil32", scope: !17, file: !17, line: >> 11, type: !47, isLocal: true, isDefinition: true, scopeLine: 11, >> isOptimized: true, unit: !1, 
variables: !61) >> !61 = !{!62, !63, !67, !70} >> !62 = !DILocalVariable(name: "x", arg: 1, scope: !60, file: !17, line: >> 11, type: !49) >> !63 = !DILocalVariable(name: "u", scope: !64, file: !17, line: 12, type: >> !66) >> !64 = distinct !DILexicalBlock(scope: !65, file: !17, line: 11, column: >> 26) >> !65 = distinct !DILexicalBlock(scope: !60, file: !17, line: 11, column: >> 11) >> !66 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned) >> !67 = !DILocalVariable(name: "e", scope: !68, file: !17, line: 13, type: >> !69) >> !68 = distinct !DILexicalBlock(scope: !64, file: !17, line: 12, column: 5) >> !69 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed) >> !70 = !DILocalVariable(name: "m", scope: !71, file: !17, line: 14, type: >> !66) >> !71 = distinct !DILexicalBlock(scope: !68, file: !17, line: 13, column: 5) >> !72 = !DILocation(line: 11, column: 11, scope: !60) >> !73 = !DILocation(line: 12, column: 27, scope: !64) >> !74 = !DILocation(line: 12, column: 13, scope: !64) >> !75 = !DILocation(line: 12, column: 5, scope: !64) >> !76 = !DILocation(line: 13, column: 18, scope: !68) >> !77 = !DILocation(line: 13, column: 20, scope: !68) >> !78 = !DILocation(line: 13, column: 27, scope: !68) >> !79 = !DILocation(line: 13, column: 35, scope: !68) >> !80 = !DILocation(line: 13, column: 5, scope: !68) >> !81 = !DILocation(line: 14, column: 5, scope: !71) >> !82 = !DILocation(line: 16, column: 9, scope: !83) >> !83 = distinct !DILexicalBlock(scope: !71, file: !17, line: 14, column: 5) >> !84 = !DILocation(line: 16, column: 11, scope: !83) >> !85 = !DILocation(line: 17, column: 16, scope: !86) >> !86 = distinct !DILexicalBlock(scope: !83, file: !17, line: 16, column: >> 18) >> !87 = !DILocation(line: 17, column: 9, scope: !86) >> !88 = !DILocation(line: 19, column: 14, scope: !83) >> !89 = !DILocation(line: 19, column: 16, scope: !83) >> !90 = !DILocation(line: 20, column: 31, scope: !91) >> !91 = distinct !DILexicalBlock(scope: !83, file: !17, 
line: 19, column: >> 22) >> !92 = !DILocation(line: 20, column: 24, scope: !91) >> !93 = !DILocation(line: 20, column: 11, scope: !91) >> !94 = !DILocation(line: 21, column: 13, scope: !91) >> !95 = !DILocation(line: 21, column: 17, scope: !91) >> !96 = !DILocation(line: 21, column: 15, scope: !91) >> !97 = !DILocation(line: 21, column: 19, scope: !91) >> !98 = !DILocation(line: 31, column: 19, scope: !99) >> !99 = distinct !DILexicalBlock(scope: !83, file: !17, line: 30, column: >> 12) >> !100 = !DILocation(line: 31, column: 21, scope: !99) >> !101 = !DILocation(line: 31, column: 18, scope: !99) >> !102 = !DILocation(line: 32, column: 13, scope: !99) >> !103 = !DILocation(line: 32, column: 15, scope: !99) >> !104 = !DILocation(line: 32, column: 21, scope: !99) >> !105 = !DILocation(line: 22, column: 20, scope: !106) >> !106 = distinct !DILexicalBlock(scope: !91, file: !17, line: 21, column: >> 25) >> !107 = !DILocation(line: 22, column: 13, scope: !106) >> !108 = !DILocation(line: 21, column: 9, scope: !91) >> !109 = !DILocation(line: 33, column: 13, scope: !110) >> !110 = distinct !DILexicalBlock(scope: !99, file: !17, line: 32, column: >> 27) >> !111 = !DILocation(line: 32, column: 9, scope: !99) >> !112 = !DILocation(line: 24, column: 19, scope: !91) >> !113 = !DILocation(line: 24, column: 21, scope: !91) >> !114 = !DILocation(line: 24, column: 18, scope: !91) >> !115 = !DILocation(line: 25, column: 13, scope: !91) >> !116 = !DILocation(line: 25, column: 15, scope: !91) >> !117 = !DILocation(line: 25, column: 21, scope: !91) >> !118 = !DILocation(line: 19, column: 10, scope: !83) >> !119 = !DILocation(line: 26, column: 13, scope: !120) >> !120 = distinct !DILexicalBlock(scope: !91, file: !17, line: 25, column: >> 27) >> !121 = !DILocation(line: 26, column: 18, scope: !120) >> !122 = !DILocation(line: 26, column: 15, scope: !120) >> !123 = !DILocation(line: 25, column: 9, scope: !91) >> !124 = !DILocation(line: 28, column: 9, scope: !91) >> !125 = 
!DILocation(line: 28, column: 15, scope: !91) >> !126 = !DILocation(line: 28, column: 14, scope: !91) >> !127 = !DILocation(line: 28, column: 11, scope: !91) >> !128 = !DILocation(line: 29, column: 23, scope: !91) >> !129 = !DILocation(line: 29, column: 9, scope: !91) >> !130 = !DILocation(line: 16, column: 5, scope: !65) >> !131 = distinct !DISubprogram(name: "panic", scope: !17, file: !17, line: >> 1, type: !132, isLocal: true, isDefinition: true, scopeLine: 1, >> isOptimized: true, unit: !1, variables: !140) >> !132 = !DISubroutineType(types: !133) >> !133 = !{!30, !134} >> !134 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&const []const >> u8", baseType: !135, size: 64, align: 64) >> !135 = !DICompositeType(tag: DW_TAG_structure_type, name: "[]u8", size: >> 128, align: 128, elements: !136) >> !136 = !{!137, !139} >> !137 = !DIDerivedType(tag: DW_TAG_member, name: "ptr", scope: !135, >> baseType: !138, size: 64, align: 64) >> !138 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&u8", baseType: >> !6, size: 64, align: 64) >> !139 = !DIDerivedType(tag: DW_TAG_member, name: "len", scope: !135, >> baseType: !32, size: 64, align: 64, offset: 64) >> !140 = !{!141} >> !141 = !DILocalVariable(name: "msg", arg: 1, scope: !131, file: !17, >> line: 1, type: !135) >> !142 = !DILocation(line: 1, column: 14, scope: !131) >> !143 = !DILocation(line: 1, column: 45, scope: !144) >> !144 = distinct !DILexicalBlock(scope: !145, file: !17, line: 1, column: >> 43) >> !145 = distinct !DILexicalBlock(scope: !131, file: !17, line: 1, column: >> 14) >> !146 = !DILocation(line: 1, column: 60, scope: !144) >> !147 = distinct !DISubprogram(name: "forceEval", scope: !17, file: !17, >> line: 39, type: !148, isLocal: true, isDefinition: true, scopeLine: 39, >> isOptimized: true, unit: !1, variables: !150) >> !148 = !DISubroutineType(types: !149) >> !149 = !{!30, !49} >> !150 = !{!151, !152, !155} >> !151 = !DILocalVariable(name: "value", arg: 1, scope: !147, file: !17, >> line: 39, 
type: !49) >> !152 = !DILocalVariable(name: "x", scope: !153, file: !17, line: 40, >> type: !49) >> !153 = distinct !DILexicalBlock(scope: !154, file: !17, line: 39, column: >> 30) >> !154 = distinct !DILexicalBlock(scope: !147, file: !17, line: 39, column: >> 18) >> !155 = !DILocalVariable(name: "p", scope: !156, file: !17, line: 41, >> type: !157) >> !156 = distinct !DILexicalBlock(scope: !153, file: !17, line: 40, column: >> 5) >> !157 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&volatile f32", >> baseType: !49, size: 64, align: 64) >> !158 = !DILocation(line: 39, column: 18, scope: !147) >> !159 = !DILocation(line: 40, column: 5, scope: !153) >> !160 = !DILocation(line: 41, column: 5, scope: !156) >> !161 = !DILocation(line: 42, column: 5, scope: !162) >> !162 = distinct !DILexicalBlock(scope: !156, file: !17, line: 41, column: >> 5) >> !163 = !DILocation(line: 42, column: 10, scope: !162) >> !164 = !DILocation(line: 42, column: 8, scope: !162) >> !165 = !DILocation(line: 39, column: 30, scope: !154) >> >> source: >> pub fn panic(msg: []const u8) -> noreturn { @breakpoint(); while (true) >> {} } >> >> pub fn ceil(x: var) -> @typeOf(x) { >> const T = @typeOf(x); >> switch (T) { >> f32 => @inlineCall(ceil32, x), >> else => @compileError("ceil not implemented for " ++ >> @typeName(T)), >> } >> } >> >> fn ceil32(x: f32) -> f32 { >> var u = @bitCast(u32, x); >> var e = i32((u >> 23) & 0xFF) - 0x7F; >> var m: u32 = undefined; >> >> if (e >= 23) { >> return x; >> } >> else if (e >= 0) { >> m = 0x007FFFFF >> u32(e); >> if (u & m == 0) { >> return x; >> } >> forceEval(x + 0x1.0p120); >> if (u >> 31 == 0) { >> u += m; >> } >> u &= ~m; >> @bitCast(f32, u) >> } else { >> forceEval(x + 0x1.0p120); >> if (u >> 31 != 0) { >> return -0.0; >> } else { >> 1.0 >> } >> } >> } >> pub fn forceEval(value: f32) { >> var x: f32 = undefined; >> const p = @ptrCast(&volatile f32, &x); >> *p = x; >> } >> >> >> export fn do_test() -> bool { >> return ceil(f32(0.0)) == ceil32(0.0); >> 
}
>>
>> -----------------------
>>
>> With no optimizations, the do_test function returns true, which is
>> expected. With -O3, the module gets rewritten to:
>>
>>> --
>>> Mehdi
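A minimal Python sketch, not from the thread, of why folding the call changes the test result. It ports the posted Zig `ceil32` and compares it with a libm-style evaluation of `ceil`. The names `user_ceil` and `folded_ceil` are hypothetical, the `forceEval` calls are left out because they do not affect the return value, and `math.ceil` only stands in for whatever value a constant folder would compute. Under those assumptions, the posted `ceil32(0.0)` takes the `e < 0` branch and returns 1.0, whereas a libm-style `ceil(0.0)` is 0.0, which would account for the `ret i1 false` seen with -O3.

```python
import math
import struct


def ceil32(x):
    """Port of the ceil32 posted in the thread (forceEval calls omitted)."""
    u = struct.unpack("<I", struct.pack("<f", x))[0]  # @bitCast(u32, x)
    e = ((u >> 23) & 0xFF) - 0x7F                     # unbiased exponent
    if e >= 23:
        return x                                      # already integral
    if e >= 0:
        m = 0x007FFFFF >> e                           # fractional-bit mask
        if u & m == 0:
            return x
        if u >> 31 == 0:
            u += m                                    # round positive values up
        u &= ~m & 0xFFFFFFFF
        return struct.unpack("<f", struct.pack("<I", u))[0]
    # |x| < 1, including x == 0.0 in the posted code
    return -0.0 if u >> 31 != 0 else 1.0


def user_ceil(x):
    # Hypothetical name: the thread's `ceil` simply forwards to ceil32.
    return ceil32(x)


def folded_ceil(x):
    # Hypothetical name: stand-in for a libm-style fold of a call named "ceil".
    return float(math.ceil(x))


# Without the fold both sides come from ceil32, so the comparison holds.
unoptimized = user_ceil(0.0) == ceil32(0.0)
# With the fold the left side becomes 0.0 while the right side stays 1.0.
misfolded = folded_ceil(0.0) == ceil32(0.0)
```

Renaming the function (for example to `ceil_asdf`, as tried in the thread) avoids the mismatch in this model simply because nothing would treat that name as a library call.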
James Y Knight via llvm-dev
2017-Jun-19 16:47 UTC
[llvm-dev] LLVM behavior different depending on function symbol name
On Mon, Jun 19, 2017 at 12:34 PM, Mehdi AMINI via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Using `opt --print-after-all -O3` I see that EarlyCSE is interpreting the
> call to `ceil` and constant-folding it:
>
> *** IR Dump After Early CSE ***
> ; Function Attrs: nobuiltin nounwind
> define i1 @do_test() #2 {
> Entry:
>   %0 = call fastcc float @ceil(float 0.000000e+00) #6
>   %1 = call fastcc float @ceil32(float 0.000000e+00) #6
>   %2 = fcmp fast oeq float 0.000000e+00, %1
>   ret i1 %2
> }
>
> So just running `opt -early-cse -debug` seems enough:
>
> EarlyCSE Simplify: %0 = call fastcc float @ceil(float 0.000000e+00) #6 to: float 0.000000e+00
>
> I suspect it is not correct for EarlyCSE to do that.

This was actually _just_ fixed: https://reviews.llvm.org/rL305132
Andrew Kelley via llvm-dev
2017-Jun-19 17:42 UTC
[llvm-dev] LLVM behavior different depending on function symbol name
On Mon, Jun 19, 2017 at 12:47 PM, James Y Knight <jyknight at google.com> wrote:

> On Mon, Jun 19, 2017 at 12:34 PM, Mehdi AMINI via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Using `opt --print-after-all -O3` I see that EarlyCSE is interpreting the
>> call to `ceil` and constant-folding it:
>>
>> *** IR Dump After Early CSE ***
>> ; Function Attrs: nobuiltin nounwind
>> define i1 @do_test() #2 {
>> Entry:
>>   %0 = call fastcc float @ceil(float 0.000000e+00) #6
>>   %1 = call fastcc float @ceil32(float 0.000000e+00) #6
>>   %2 = fcmp fast oeq float 0.000000e+00, %1
>>   ret i1 %2
>> }
>>
>> So just running `opt -early-cse -debug` seems enough:
>>
>> EarlyCSE Simplify: %0 = call fastcc float @ceil(float 0.000000e+00) #6 to: float 0.000000e+00
>>
>> I suspect it is not correct for EarlyCSE to do that.
>
> This was actually _just_ fixed:
> https://reviews.llvm.org/rL305132

Excellent. Is the fix included in LLVM 4.0.1?