thr3ads.net - llvm dev - [llvm-dev] invalid code generated on Windows x86_64 using skylake-specific features [Oct 2017]

Andrew Kelley via llvm-dev

2017-Oct-03 04:14 UTC

[llvm-dev] invalid code generated on Windows x86_64 using skylake-specific features

I figured it out. I was using this implementation of __chkstk from
compiler-rt:

DEFINE_COMPILERRT_FUNCTION(___chkstk)
        push   %rcx
        cmp    $0x1000,%rax
        lea    16(%rsp),%rcx     // rsp before calling this routine -> rcx
        jb     1f
2:
        sub    $0x1000,%rcx
        test   %rcx,(%rcx)
        sub    $0x1000,%rax
        cmp    $0x1000,%rax
        ja     2b
1:
        sub    %rax,%rcx
        test   %rcx,(%rcx)

        lea    8(%rsp),%rax     // load pointer to the return address into rax
        mov    %rcx,%rsp        // install the new top of stack pointer into rsp
        mov    -8(%rax),%rcx    // restore rcx
        push   (%rax)           // push return address onto the stack
        sub    %rsp,%rax        // restore the original value in rax
        ret
END_COMPILERRT_FUNCTION(___chkstk)

(source https://github.com/llvm-project/llvm-project-20170507/blob/release_50/compiler-rt/lib/builtins/x86_64/chkstk2.S)

When I replaced it with a simple `ret`, everything worked.

The disassembled ntdll implementation is:

__chkstk:
1800a9f60:  48 83 ec 10     subq    $16, %rsp
1800a9f64:  4c 89 14 24     movq    %r10, (%rsp)
1800a9f68:  4c 89 5c 24 08  movq    %r11, 8(%rsp)
1800a9f6d:  4d 33 db    xorq    %r11, %r11
1800a9f70:  4c 8d 54 24 18  leaq    24(%rsp), %r10
1800a9f75:  4c 2b d0    subq    %rax, %r10
1800a9f78:  4d 0f 42 d3     cmovbq  %r11, %r10
1800a9f7c:  65 4c 8b 1c 25 10 00 00 00  movq    %gs:16, %r11
1800a9f85:  4d 3b d3    cmpq    %r11, %r10
1800a9f88:  73 15   jae 21 <__chkstk+0x3F>
1800a9f8a:  66 41 81 e2 00 f0   andw    $61440, %r10w
1800a9f90:  4d 8d 9b 00 f0 ff ff    leaq    -4096(%r11), %r11
1800a9f97:  45 84 1b    testb   (%r11), %r11b
1800a9f9a:  4d 3b d3    cmpq    %r11, %r10
1800a9f9d:  75 f1   jne -15 <__chkstk+0x30>
1800a9f9f:  4c 8b 14 24     movq    (%rsp), %r10
1800a9fa3:  4c 8b 5c 24 08  movq    8(%rsp), %r11
1800a9fa8:  48 83 c4 10     addq    $16, %rsp

On Mon, Oct 2, 2017 at 1:37 PM, Reid Kleckner <rnk at google.com> wrote:
> Can you post test.obj somewhere, and maybe the LLVM IR if you can get it?
> If it really was reading address 0xFFFFFFFFFFFFFFFF, then RBP must have
> been completely corrupted, probably by the prologue.
>
> On Sat, Sep 30, 2017 at 6:27 PM, Andrew Kelley via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I suspect that there are 2 issues here:
>>
>>  * I have incorrect alignment somewhere
>>  * MSVC / .pdb / CodeView debugging is not working correctly.
>>
>> I think the latter would help solve the former.
>>
>> I will send out a new email later talking about the issues I'm having
>> debugging llvm-generated binaries with MSVC.
>>
>> On Sat, Sep 30, 2017 at 3:33 PM, Andrew Kelley <superjoe30 at gmail.com> wrote:
>>
>>> I have this code, which works fine on MacOS and Linux hosts:
>>>
>>>     const char *target_specific_cpu_args;
>>>     const char *target_specific_features;
>>>     if (g->is_native_target) {
>>>         target_specific_cpu_args = ZigLLVMGetHostCPUName();
>>>         target_specific_features = ZigLLVMGetNativeFeatures();
>>>     } else {
>>>         target_specific_cpu_args = "";
>>>         target_specific_features = "";
>>>     }
>>>
>>>     g->target_machine = LLVMCreateTargetMachine(target_ref, buf_ptr(&g->triple_str),
>>>             target_specific_cpu_args, target_specific_features,
>>>             opt_level, reloc_mode, LLVMCodeModelDefault);
>>>
>>>
>>>
>>> char *ZigLLVMGetHostCPUName(void) {
>>>     std::string str = sys::getHostCPUName();
>>>     return strdup(str.c_str());
>>> }
>>>
>>> char *ZigLLVMGetNativeFeatures(void) {
>>>     SubtargetFeatures features;
>>>
>>>     StringMap<bool> host_features;
>>>     if (sys::getHostCPUFeatures(host_features)) {
>>>         for (auto &F : host_features)
>>>             features.AddFeature(F.first(), F.second);
>>>     }
>>>
>>>     return strdup(features.getString().c_str());
>>> }
>>>
>>> On this Windows laptop that I am testing on, I get these values:
>>>
>>> target_specific_cpu_args: skylake
>>>
>>> target_specific_features: +sse2,+cx16,-tbm,-avx512ifma,-avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,+xsaves,-avx512er,-avx512vpopcntdq,-clwb,-avx512f,-clzero,-pku,+mmx,-lwp,-xop,+rdseed,-sse4a,-avx512bw,+clflushopt,+xsave,-avx512vl,-avx512cd,+avx,-rtm,+fma,+bmi,+rdrnd,-mwaitx,+sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,+ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3
>>>
>>>
>>> It successfully creates a binary, but the binary when run crashes with:
>>>
>>> Unhandled exception at 0x00007FF7C9913BA7 in test.exe: 0xC0000005:
>>> Access violation reading location 0xFFFFFFFFFFFFFFFF.
>>>
>>> The disassembly of the crashed instruction is:
>>>
>>> 00007FF7C9913BA7  vmovdqa     xmmword ptr [rbp-20h],xmm0
>>>
>>> There is no callstack or source in the MSVC debugger. The .pdb produced
>>> is 64KB exactly. The file was linked with:
>>>
>>> lld -NOLOGO -DEBUG -MACHINE:X64 /SUBSYSTEM:console -OUT:.\test.exe
>>> -NODEFAULTLIB -ENTRY:_start ./zig-cache/test.obj ./zig-cache/builtin.obj
>>> ./zig-cache/compiler_rt.obj ./zig-cache/kernel32.lib
>>>
>>>
>>> When I change the call to LLVMCreateTargetMachine so that both
>>> target_specific_cpu_args and target_specific_features are the empty
>>> string, the produced binary is valid and runs successfully.
>>>
>>> Is this an LLVM bug? Am I using the API incorrectly? Is there more
>>> information I can provide to the LLVM-dev mailing list that would make it
>>> easier to help me?
>>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
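
The ZigLLVMGetNativeFeatures helper quoted above turns a map of feature names and on/off flags into the comma-separated "+name,-name" list shown in target_specific_features. As a rough illustration of that string format only, here is a sketch that does not use LLVM at all. FeatureMap and build_feature_string are hypothetical names, and std::map is a stand-in for LLVM's StringMap, so the ordering of entries will differ from what LLVM produces.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the map filled by sys::getHostCPUFeatures:
// feature name -> whether the host CPU reports it.
using FeatureMap = std::map<std::string, bool>;

// Joins the map into "+name,-name,..." form, the same shape as the
// target_specific_features value quoted in the thread.
std::string build_feature_string(const FeatureMap &host_features) {
    std::string out;
    for (const auto &entry : host_features) {
        assert(!entry.first.empty() && "feature names must be non-empty");
        if (!out.empty())
            out += ',';                      // separator between entries
        out += entry.second ? '+' : '-';     // enabled or disabled
        out += entry.first;
    }
    return out;
}
```

An empty map gives an empty string, which matches the non-native branch of the quoted code where both arguments are passed as "".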

Andrew Kelley via llvm-dev

2017-Oct-03 04:34 UTC

I tried __chkstk_ms from compiler-rt which has this definition:

DEFINE_COMPILERRT_FUNCTION(___chkstk_ms)
        push   %rcx
        push   %rax
        cmp    $0x1000,%rax
        lea    24(%rsp),%rcx
        jb     1f
2:
        sub    $0x1000,%rcx
        test   %rcx,(%rcx)
        sub    $0x1000,%rax
        cmp    $0x1000,%rax
        ja     2b
1:
        sub    %rax,%rcx
        test   %rcx,(%rcx)
        pop    %rax
        pop    %rcx
        ret
END_COMPILERRT_FUNCTION(___chkstk_ms)


except I called it __chkstk since that's the symbol that LLVM generated a
dependency on. And it passed all my tests, with optimizations on and off.

Can anyone shed some light on this?
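
One way to read the two compiler-rt listings side by side: both run the same page-probing loop, but ___chkstk installs the probed address as the new rsp before returning, while ___chkstk_ms pops its saved registers and returns with rsp unchanged (as the ntdll listing also appears to do with its final addq $16). The sketch below models only the stack pointer arithmetic, not real memory. All function names are hypothetical, and the step where the caller subtracts the size itself after the call is an assumption about the generated prologue that the thread does not confirm.

```cpp
#include <cstdint>

constexpr std::uint64_t kPageSize = 0x1000;

// Outcome of the shared probe loop: the lowest address touched and
// how many `test` instructions were executed.
struct ProbeResult {
    std::uint64_t lowest_probe;
    unsigned probes;
};

// Follows the loop common to both listings. rcx starts at the caller's
// rsp (its value before the call), rax holds the requested size.
constexpr ProbeResult run_probe_loop(std::uint64_t caller_rsp, std::uint64_t size) {
    std::uint64_t rcx = caller_rsp;
    std::uint64_t rax = size;
    unsigned probes = 0;
    if (rax >= kPageSize) {            // `jb 1f` not taken
        do {
            rcx -= kPageSize;          // sub  $0x1000,%rcx
            ++probes;                  // test %rcx,(%rcx)
            rax -= kPageSize;          // sub  $0x1000,%rax
        } while (rax > kPageSize);     // ja   2b
    }
    rcx -= rax;                        // sub  %rax,%rcx
    ++probes;                          // test %rcx,(%rcx)
    return ProbeResult{rcx, probes};
}

// ___chkstk_ms: pops rax and rcx, so rsp is back at the caller's value.
constexpr std::uint64_t rsp_after_chkstk_ms(std::uint64_t caller_rsp, std::uint64_t) {
    return caller_rsp;
}

// ___chkstk (chkstk2.S): moves rsp down to the last probed address.
constexpr std::uint64_t rsp_after_chkstk(std::uint64_t caller_rsp, std::uint64_t size) {
    return run_probe_loop(caller_rsp, size).lowest_probe;
}

// ASSUMPTION (not confirmed in the thread): the generated prologue
// subtracts the size from rsp itself once the call returns.
constexpr std::uint64_t rsp_after_prologue(std::uint64_t rsp_after_call, std::uint64_t size) {
    return rsp_after_call - size;
}
```

Under that assumption, pairing the prologue with the rsp-adjusting routine lowers rsp by twice the requested size, while the probe-only routine lowers it by the size exactly once. Whether this is what actually corrupted the frame here is not established by the thread.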


Andrew Kelley via llvm-dev

2017-Oct-03 04:55 UTC

The crashes are gone, but I'm still getting weird behavior with cpu native
features turned on. Example:


const assert = @import("std").debug.assert;
test "f128" {
    if (make_f128(1.0) == 1.1) @panic("wrong");
}
fn make_f128(x: f128) -> f128 { x }



; Function Attrs: nobuiltin nounwind
define internal fastcc fp128 @make_f128(fp128) unnamed_addr #0 !dbg !357 {
Entry:
  %x = alloca fp128, align 16
  store fp128 %0, fp128* %x, align 16
  call void @llvm.dbg.declare(metadata fp128* %x, metadata !362, metadata !219), !dbg !363
  %1 = load fp128, fp128* %x, align 16, !dbg !364
  ret fp128 %1, !dbg !367
}

; Function Attrs: nobuiltin nounwind
define fastcc void @f128() #0 !dbg !312 {
Entry:
  %0 = call fastcc fp128 @make_f128(fp128 0xL00000000000000003FFF000000000000), !dbg !315
  %1 = fcmp fast oeq fp128 %0, 0xLA0000000000000003FFF199999999999, !dbg !317
  br i1 %1, label %Then, label %Else, !dbg !317

Then:                                             ; preds = %Entry
  call void @panic(%"[]u8"* bitcast ({ i8*, i64 }* @7 to %"[]u8"*)), !dbg !318
  unreachable, !dbg !318

Else:                                             ; preds = %Entry
  ret void, !dbg !319
}


This calls the panic function, even though these two f128 values are
clearly not equal. When I revert to not using target-native features, the
test passes.
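
To check by hand that the two fp128 constants in the IR really differ, the bit patterns can be split into sign, exponent and fraction fields. The sketch below does that for the two literals above. It assumes that in the 0xL form the first 16 hex digits are the low 64 bits and the last 16 are the high 64 bits, which is consistent with 0x3FFF (the binary128 exponent bias) appearing in the second half of the 1.0 constant. Fp128Bits and the helper names are hypothetical.

```cpp
#include <cstdint>

// Raw IEEE 754 binary128 encoding: 1 sign bit, 15 exponent bits and
// 112 fraction bits (48 of them in the high word, 64 in the low word).
struct Fp128Bits {
    std::uint64_t low;
    std::uint64_t high;
};

constexpr bool sign_bit(Fp128Bits v) { return (v.high >> 63) != 0; }

constexpr unsigned biased_exponent(Fp128Bits v) {
    return static_cast<unsigned>((v.high >> 48) & 0x7FFF);
}

constexpr std::uint64_t fraction_high(Fp128Bits v) {
    return v.high & 0xFFFFFFFFFFFFULL;   // upper 48 fraction bits
}

constexpr bool same_bits(Fp128Bits a, Fp128Bits b) {
    return a.low == b.low && a.high == b.high;
}

// ASSUMPTION: low word first, high word second in the `0xL` literals.
// 0xL00000000000000003FFF000000000000 (argument to make_f128)
constexpr Fp128Bits kArgument{0x0000000000000000ULL, 0x3FFF000000000000ULL};
// 0xLA0000000000000003FFF199999999999 (right-hand side of the fcmp)
constexpr Fp128Bits kCompared{0xA000000000000000ULL, 0x3FFF199999999999ULL};
```

Both constants have a clear sign bit and a biased exponent of 0x3FFF, so both lie in [1, 2), but their fraction bits differ. Since neither is zero or NaN, an ordered-equal comparison of these two encodings should be false.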


