thr3ads.net - search: "ldmxcsr"

2014 Jan 28

2

[LLVMdev] ldmxcsr reordering issue

Hi, I ran into trouble JIT-compiling x86 code that uses Intrinsic::x86_sse_ldmxcsr. The target code must execute some SSE2 instructions with DAZ/FTZ modes enabled and others with DAZ/FTZ disabled. I'm trying to achieve this by emitting LDMXCSR instructions with the proper flag words. However, it appears that the execution engine sometimes reorders these instructions with computational one...
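For reference, the "flag words" the post mentions differ in only two MXCSR bits: FTZ is bit 15 (0x8000) and DAZ is bit 6 (0x0040). A minimal C sketch of composing the two words follows. The helper name mxcsr_with_daz_ftz is an illustrative assumption and does not come from the thread.

```c
#include <stdint.h>

/* MXCSR control bits as defined by the x86 architecture. */
#define MXCSR_DAZ (1u << 6)   /* denormals-are-zero */
#define MXCSR_FTZ (1u << 15)  /* flush-to-zero */

/* Hypothetical helper (not from the thread): return `csr` with DAZ and FTZ
   both set (enable != 0) or both cleared (enable == 0). */
static uint32_t mxcsr_with_daz_ftz(uint32_t csr, int enable)
{
    const uint32_t mask = MXCSR_DAZ | MXCSR_FTZ;
    return enable ? (csr | mask) : (csr & ~mask);
}
```

Starting from the power-on default of 0x1F80 (all exceptions masked), enabling both modes gives 0x9FC0. Each word would be stored to memory and loaded with LDMXCSR. This sketch only shows the flag values and does nothing about the reordering the thread is about.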

2016 Nov 13

2

llc generating code that writes below the stack pointer on darwin/x86-64

...rd);
  %tmp.1 = alloca i32, align 4
  ; Var w located at %tmp.1
  ; [72] begin
  store i32 %p.w, i32* %tmp.1, align 4
  ; [73] defaultmxcsr:=w;
  %reg.1_16 = load i32, i32* %tmp.1, align 4
  store i32 %reg.1_16, i32* @"\01_TC_$SYSTEM_$$_DEFAULTMXCSR", align 4
  ; [75] ldmxcsr w
  call void asm sideeffect "# [math.inc]\0A\09ldmxcsr\09$0\0A","=*m,~{memory},~{fpsr},~{flags},~{rax},~{rcx},~{rdx},~{rsi},~{rdi},~{r8},~{r9},~{r10},~{r11}"(i32* %tmp.1)
  ; [77] end;
  br label %Lj1135
Lj1135:
  ret void
}
@"\01_TC_$SYSTEM_$$_DEFAUL...

2012 Jun 14

1

High CPU usage

...el.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz/

    __m128 state[32];                     // 512-byte, 16-byte-aligned FXSAVE area
    __int32 temp;
    __asm fxsave state;
    memcpy(&temp, (char*)state + 24, 4);  // MXCSR is at byte offset 24
    temp |= (1 << 11); // UNDERFLOW_EXCEPTION_MASK
    temp |= (1 << 15); // FTZ_BIT
    __asm ldmxcsr temp;

Tested with Visual Studio 2010 on x86. Please let me know if it works for you too.

Mark

2012 Jun 13

0

High CPU usage

...el.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz/

    __m128 state[32];                     // 512-byte, 16-byte-aligned FXSAVE area
    __int32 temp;
    __asm fxsave state;
    memcpy(&temp, (char*)state + 24, 4);  // MXCSR is at byte offset 24
    temp |= (1 << 11); // UNDERFLOW_EXCEPTION_MASK
    temp |= (1 << 15); // FTZ_BIT
    __asm ldmxcsr temp;

Tested with Visual Studio 2010 on x86. Please let me know if it works for you too.

Mark

2011 Jul 09

1

[LLVMdev] LLVM floating point rounding modes

Hi, I am not sure if this is the right mailing list to ask my question, if not, please refer me to the proper one. Is there any support for rounding modes in LLVM floating point? I looked in the assembler reference manual, and it doesn't seem so. I am thinking about choosing LLVM as one of the backends for my programming language Babel-17 (www.babel-17.com). Babel-17 features interval

2017 Apr 20

4

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

...ore or less pretend that floating point status bits don’t exist (at least before you get to the target-specific backend). You’ll find that the X86 backend doesn’t even model MXCSR at the moment. I tried to add it recently and it kind of blew up before I had even modeled it for anything other than LDMXCSR and STMXCSR. We may want to address that at some point, but right now it just isn’t there. When we discussed how FENV_ACCESS support should be implemented, Chandler proposed that when restricted modes (whether FENV_ACCESS or any other front end-specific analogous behavior) were not being used the...

2020 Aug 28

0

Wine release 5.16

...thread data cleanup.

Paul Gofman (17):
      ntdll: Report newer vector processor features on x86 / x64.
      ntdll: Don't transfer xmm registers explicitly during context save and restore on x64.
      include: Update _XSTATE_CONFIGURATION structure definition.
      ntdll: Remove redundant ldmxcsr in set_full_cpu_context() on x86_64.
      include: Define _XSAVE_FORMAT structure.
      include: Define extended context structures.
      include: Implement __cpuidex() function.
      wineboot: Initialize XState features in user_shared_data.
      kernel32: Implement GetEnabledXStateFeatures()....

2015 Feb 13

2

[LLVMdev] trunk's optimizer generates slower code than 3.5

...push esi
    push edi
    push ebx
    sub esp, 74h
    push 3
    call sub_4080F0
    add esp, 4
    stmxcsr [esp+80h+var_80]
    or [esp+80h+var_80], 8000h
    ldmxcsr [esp+80h+var_80]
    cmp [ebp+argc], 2
    jz short loc_40103A
    mov eax, 0FFFFFFFFh
    add esp, 74h
    pop ebx
    pop edi
    pop esi
    mov esp, ebp...
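The stmxcsr / or 8000h / ldmxcsr sequence in this listing is a read-modify-write of MXCSR that sets bit 15 (flush-to-zero). A rough C equivalent using the _mm_getcsr and _mm_setcsr intrinsics from <xmmintrin.h> might look like the sketch below. The function name is hypothetical, and the fallback that returns 0 on targets without SSE is an assumption added so the sketch compiles elsewhere.

```c
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0
#endif

/* Hypothetical helper: set the FTZ bit in MXCSR for the calling thread.
   Returns the MXCSR value read back afterwards, or 0 if MXCSR is unavailable. */
static unsigned int enable_flush_to_zero(void)
{
#if HAVE_MXCSR
    unsigned int csr = _mm_getcsr();  /* corresponds to stmxcsr */
    csr |= 0x8000u;                   /* corresponds to or ..., 8000h */
    _mm_setcsr(csr);                  /* corresponds to ldmxcsr */
    return _mm_getcsr();
#else
    return 0u;
#endif
}
```

MXCSR is per-thread state, so a call like this affects only the thread that runs it.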

2015 Feb 14

2

[LLVMdev] trunk's optimizer generates slower code than 3.5

...ebx
>> sub esp, 74h
>> push 3
>> call sub_4080F0
>> add esp, 4
>> stmxcsr [esp+80h+var_80]
>> or [esp+80h+var_80], 8000h
>> ldmxcsr [esp+80h+var_80]
>> cmp [ebp+argc], 2
>> jz short loc_40103A
>> mov eax, 0FFFFFFFFh
>> add esp, 74h
>> pop ebx
>> pop edi
>>...

2015 Feb 14

2

[LLVMdev] trunk's optimizer generates slower code than 3.5

...>>>> push 3
>>>> call sub_4080F0
>>>> add esp, 4
>>>> stmxcsr [esp+80h+var_80]
>>>> or [esp+80h+var_80], 8000h
>>>> ldmxcsr [esp+80h+var_80]
>>>> cmp [ebp+argc], 2
>>>> jz short loc_40103A
>>>> mov eax, 0FFFFFFFFh
>>>> add esp, 74h
>>>> pop ebx
>>>>...

2017 Apr 19

3

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

Changing the list from cfe-dev to llvm-dev

> On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote:
>
> I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI.
>
> I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic
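As background for the FPToUI lowering under discussion: x86-64 before AVX-512 has no instruction that converts a float directly to an unsigned 64-bit integer, so the conversion is typically built from the signed one. The sketch below shows the branching form of that idea in C. It illustrates the general technique rather than LLVM's actual instruction sequence, and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: float -> uint64_t using only signed conversions.
   Assumes 0 <= x < 2^64; other inputs are not meaningful here. */
static uint64_t float_to_u64(float x)
{
    const float two63 = 0x1p63f;       /* 2^63, exactly representable as float */
    if (x < two63)
        return (uint64_t)(int64_t)x;   /* value fits in the signed range */
    /* For x >= 2^63 the subtraction is exact and the difference fits in
       int64_t; the 2^63 that was subtracted is restored as bit 63. */
    return (uint64_t)(int64_t)(x - two63) | (UINT64_C(1) << 63);
}
```

A branch-free (conditional move) variant evaluates both paths and selects one result, which is one way a floating-point status flag can be raised by a computation whose result is then discarded.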

2017 Apr 21

2

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

...ly float and double to unsigned need to round or clamp negative values to zero. I guess this is undefined behaviour in C. You’ll find that the X86 backend doesn’t even model MXCSR at the moment. I tried to add it recently and it kind of blew up before I had even modeled it for anything other than LDMXCSR and STMXCSR. We may want to address that at some point, but right now it just isn’t there. When we discussed how FENV_ACCESS support should be implemented, Chandler proposed that when restricted modes (whether FENV_ACCESS or any other front end-specific analogous behavior) were not being used the...

2020 Jul 24

86

[PATCH v5 00/75] x86: SEV-ES Guest Support

From: Joerg Roedel <jroedel at suse.de>

Hi,

here is a rebased version of the latest SEV-ES patches. They are now based on latest tip/master instead of upstream Linux and include the necessary changes.

Changes to v4 are in particular:

- Moved early IDT setup code to idt.c, because the idt_descr and the idt_table are now static
- This required to make stack protector work early (or

2020 Jul 14

92

[PATCH v4 00/75] x86: SEV-ES Guest Support

From: Joerg Roedel <jroedel at suse.de> Hi, here is the fourth version of the SEV-ES Guest Support patches. I addressed the review comments sent to me for the previous version and rebased the code onto v5.8-rc5. The biggest change in this version is the IST handling code for the #VC handler. I adapted the entry code for the #VC handler to the big pile of entry code changes merged into

2020 Aug 24

96

[PATCH v6 00/76] x86: SEV-ES Guest Support

From: Joerg Roedel <jroedel at suse.de>

Hi,

here is the new version of the SEV-ES client enabling patch-set. It is based on the latest tip/master branch and contains the necessary changes. In particular those are:

- Enabling CR4.FSGSBASE early on supported processors so that early #VC exceptions on APs can be handled.
- Add another patch (patch 1) to fix a KVM frame-size build

2020 Sep 07

84

[PATCH v7 00/72] x86: SEV-ES Guest Support

From: Joerg Roedel <jroedel at suse.de> Hi, here is a new version of the SEV-ES Guest Support patches for x86. The previous versions can be found as a linked list starting here: https://lore.kernel.org/lkml/20200824085511.7553-1-joro at 8bytes.org/ I updated the patch-set based on the review comments I got and the discussions around it. Another important change is that the early IDT
