thr3ads.net - search: "sandybridge"

Displaying 20 results from an estimated 216 matches for "sandybridge".

[RFC] llvm-exegesis: Automatic Measurement of Instruction Latency/Uops

2018 Mar 15

...e ‘implementation’ section of the RFC. For people familiar with the work of Agner Fog, this is essentially an automation of the process of building the code snippets using instruction descriptions from LLVM. Results - Solving this bug <https://bugs.llvm.org/show_bug.cgi?id=36084> (sandybridge): > llvm-exegesis -opcode-name IMUL16rri8 -benchmark-mode latency --- asm_template: name: latency IMUL16rri8 cpu_name: sandybridge llvm_triple: x86_64-grtev4-linux-gnu num_repetitions: 10000 measurements: - { key: latency, value: 4.0115, debug_string: ''...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

Hi, Yes. On Sandybridge 256-bit loads/stores are double pumped. This means that they go in one after the other in two cycles. On Haswell the memory ports are wide enough to allow a 256bit memory operation in one cycle. So, on Sandybridge we split unaligned memory operations into two 128bit parts to allow them to execut...

[RFC] llvm-exegesis: Automatic Measurement of Instruction Latency/Uops

2018 Mar 15

...e familiar with the work of Agner Fog, this is essentially an > automation of the process of building the code snippets using > instruction descriptions from LLVM. > > > Results > > * > > Solving this bug > <https://bugs.llvm.org/show_bug.cgi?id=36084>(sandybridge): > > > llvm-exegesis -opcode-name IMUL16rri8 -benchmark-mode latency > > --- > > asm_template: > > name: latency IMUL16rri8 > > cpu_name: sandybridge > > llvm_triple: x86_64-grtev4-linux-gnu > > num_repetitions: 10000 > >...

[RFC] llvm-exegesis: Automatic Measurement of Instruction Latency/Uops

2018 Mar 15

...For people familiar with the work of Agner Fog, this is essentially an > automation of the process of building the code snippets using instruction > descriptions from LLVM. > Results > > - > > Solving this bug <https://bugs.llvm.org/show_bug.cgi?id=36084> > (sandybridge): > > > llvm-exegesis -opcode-name IMUL16rri8 -benchmark-mode latency > > --- > > asm_template: > > name: latency IMUL16rri8 > > cpu_name: sandybridge > > llvm_triple: x86_64-grtev4-linux-gnu > > num_repetitions: 10000 > > me...

Re: 'virsh capabilities' on Debian Wheezy-amd64 reports different cpu to Wheezy-i386 (on same hardware)

2014 Mar 03

...>>> I've just managed to install the libvirt-bin:amd64 package on the same >>>>>> machine, on the wheezy-i386 distribution. The output of 'virsh >>>>>> capabilities' is now reported correctly, and my VMs that required >>>>>> SandyBridge are now booting. >>>>>> >>>>>> Have you any further suggestions? >>>>>> >>>>> Oh, I missed the fact that it was 32 bit distro. In that case, I >>>>> guess some cpu features won't be available. No other ideas....

[RFC] llvm-exegesis: Automatic Measurement of Instruction Latency/Uops

2018 Mar 15

...familiar with the work of Agner Fog, this is essentially an > automation of the process of building the code snippets using > instruction descriptions from LLVM. > > > Results > > * > > Solving this bug > <https://bugs.llvm.org/show_bug.cgi?id=36084>(sandybridge): > > > llvm-exegesis -opcode-name IMUL16rri8 -benchmark-mode latency > > --- > > asm_template: > > name: latency IMUL16rri8 > > cpu_name: sandybridge > > llvm_triple: x86_64-grtev4-linux-gnu > > num_repetitions: 10000 > > me...

Regression in stable for ThinkPad T520 with Intel GPU (Sandybridge) between June 22 and July 18

2012 Jul 25

I will shortly spend a bit of time tracking down the breakage more closely, but my 9-Stable system of June 22 runs fine. After an update on about July 10, I noted that it would hang after Xorg was started, but usually worked. After an upgrade to July 18, my system could no longer start Gnome. It would start Xorg and Gnome would start normally, getting many apps started, but about 10 seconds after

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

...e difference disappear. I will try to narrow > down what is going on and if it seems related LLVM, I will post an example. > Thanks again, > > Zach > > > On Tue, Jul 9, 2013 at 10:15 PM, Nadav Rotem <nrotem at apple.com> wrote: > >> Hi, >> >> Yes. On Sandybridge 256-bit loads/stores are double pumped. This means >> that they go in one after the other in two cycles. On Haswell the memory >> ports are wide enough to allow a 256bit memory operation in one cycle. So, >> on Sandybridge we split unaligned memory operations into two 128bit pa...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

...ting this part of the kernel makes the performance difference disappear. I will try to narrow down what is going on and if it seems related LLVM, I will post an example. Thanks again, Zach On Tue, Jul 9, 2013 at 10:15 PM, Nadav Rotem <nrotem at apple.com> wrote: > Hi, > > Yes. On Sandybridge 256-bit loads/stores are double pumped. This means > that they go in one after the other in two cycles. On Haswell the memory > ports are wide enough to allow a 256bit memory operation in one cycle. So, > on Sandybridge we split unaligned memory operations into two 128bit parts > to...

Re: 'virsh capabilities' on Debian Wheezy-amd64 reports different cpu to Wheezy-i386 (on same hardware)

2014 Mar 03

...'ve just managed to install the libvirt-bin:amd64 package on the same > >>>>>> machine, on the wheezy-i386 distribution. The output of 'virsh > >>>>>> capabilities' is now reported correctly, and my VMs that required > >>>>>> SandyBridge are now booting. > >>>>>> > >>>>>> Have you any further suggestions? > >>>>>> > >>>>> Oh, I missed the fact that it was 32 bit distro. In that case, I > >>>>> guess some cpu features won't be ava...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Sep 19

...w >> down what is going on and if it seems related LLVM, I will post an example. >> Thanks again, >> >> Zach >> >> >> On Tue, Jul 9, 2013 at 10:15 PM, Nadav Rotem <nrotem at apple.com> wrote: >> >>> Hi, >>> >>> Yes. On Sandybridge 256-bit loads/stores are double pumped. This means >>> that they go in one after the other in two cycles. On Haswell the memory >>> ports are wide enough to allow a 256bit memory operation in one cycle. So, >>> on Sandybridge we split unaligned memory operations into t...

[RFC] llvm-exegesis: Automatic Measurement of Instruction Latency/Uops

2018 Mar 15

...ntially an automation of the process of building the code >> snippets using instruction descriptions from LLVM. >> >> >> Results >> >> * >> >> Solving this bug >> <https://bugs.llvm.org/show_bug.cgi?id=36084>(sandybridge): >> >> > llvm-exegesis -opcode-name IMUL16rri8 -benchmark-mode latency >> >> --- >> >> asm_template: >> >> name: latency IMUL16rri8 >> >> cpu_name: sandybridge >> >> llvm_tripl...

libvirt does not show same CPU Model as /proc/cpuinfo for CPU Model info.

2017 Jan 28

...--name compute-0 --cpu Haswell,+fma,+movbe,+fsgsbase,+bmi1,+hle,+avx2,+smep,+bmi2,+erms,+invpcid,+rtm --ram=61440 --vcpus=20 --os-type=linux --os-variant=generic After guest installation /proc/cpuinfo show model name as Haswell . However Libvirt virsh capabilities show CPU configuration as "SandyBridge . " 1. Could I get explanation of the why inconsistency . 2. Is this expected behaviour 3. Why libvirt does not show same CPU Model as /proc/cpuinfo for CPU Model info. My Aim : Nested Virtualization and Nested Guest needs CPU model as Haswell strictly . 4. IS it possible alternative to s...

[LLVMdev] AVX broadcast Vs. vector constant pool load

2012 Nov 07

Hey guys, I'm currently investigating broadcasts from the constant pool on Sandy Bridge. I see this comment in llvm/lib/Target/X86/X86ISelLowering.cpp: // Handle the broadcasting a single constant scalar from the constant pool // into a vector. On Sandybridge it is still better to load a constant vector // from the constant pool and not to broadcast it from a scalar. Would anyone be able to explain why it is better to load a vector from the constant pool rather than broadcast a scalar? I checked out Agner Fog's tables, but it wasn't so obvi...

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 22

...4c38..d221316 100644 --- a/lib/Target/X86/X86Subtarget.cpp +++ b/lib/Target/X86/X86Subtarget.cpp @@ -285,7 +285,8 @@ void X86Subtarget::AutoDetectSubtargetFeatures() { (Family == 6 && Model == 0x2F) || // Westmere: Westmere-EX (Family == 6 && Model == 0x2A) || // SandyBridge (Family == 6 && Model == 0x2D) || // SandyBridge: SandyBridge-E* - (Family == 6 && Model == 0x3A))) {// IvyBridge + (Family == 6 && Model == 0x3A) || // IvyBridge + (Family == 6 && Model == 0x3C))) {// Haswell IsUAMemFast = tr...
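
The check in the quoted diff can be read as a lookup on (family, model) pairs. The following is a minimal Python sketch of that idea, not the LLVM implementation: the names `FAST_UNALIGNED_MODELS` and `is_ua_mem_fast` are hypothetical, and the table holds only the entries visible in the truncated snippet, so the real list is likely longer.

```python
# Hypothetical table: only the (family, model) pairs visible in the
# truncated diff above. The start of the real condition is cut off.
FAST_UNALIGNED_MODELS = {
    (6, 0x2F): "Westmere-EX",
    (6, 0x2A): "SandyBridge",
    (6, 0x2D): "SandyBridge-E*",
    (6, 0x3A): "IvyBridge",
    (6, 0x3C): "Haswell",  # the entry the patch adds
}


def is_ua_mem_fast(family: int, model: int) -> bool:
    """Return True if the CPU is in the (partial) fast-unaligned-memory list."""
    return (family, model) in FAST_UNALIGNED_MODELS


if __name__ == "__main__":
    for family, model in [(6, 0x2A), (6, 0x3C), (6, 0x0F)]:
        name = FAST_UNALIGNED_MODELS.get((family, model), "not listed")
        print(f"family={family} model={model:#x}: {name}")
```

A table keeps each model's name next to its number, which is the role the trailing comments play in the original C++ condition.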

[LLVMdev] X86 FMA4

2012 Jul 27

...same stats as vmovaps, so I feel it is a safe assumption to make that vmovsd has the same stats as well. Michael On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote: > Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. > > I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch. > > -Cameron > >...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

On Tue, Jul 9, 2013 at 9:01 PM, Zach Devito <zdevito at gmail.com> wrote: > I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads > on AVX. > 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a > single instruction (details below). > In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which > seems to be

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

...ill change the codegen of several SPECCPU benchmarks: Code size: 447.dealII 0.50% 453.povray 0.42% 433.milc 0.20% 445.gobmk 0.32% 403.gcc 0.05% 464.h264ref 3.62% Compile Time: 447.dealII 0.22% 453.povray -0.16% 433.milc 0.09% 445.gobmk -2.43% 403.gcc 0.06% 464.h264ref 3.21% Performance (on intel sandybridge): 447.dealII +0.07% 453.povray +1.79% 433.milc +1.02% 445.gobmk +0.56% 403.gcc -0.16% 464.h264ref -0.41% Looks like the change has overall positive performance impact with very small code size/compile time overhead. Now the question is shall we make this change default in O2, or shall we leave it...
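
The "overall positive" reading of the figures above can be checked with a quick average. This is a rough Python sketch that assumes a plain unweighted arithmetic mean over the six listed benchmarks; the original post does not say how (or whether) the author aggregated them, and the variable names are illustrative.

```python
from statistics import mean

# Percent changes copied from the snippet above.
# For code size and compile time a positive value is an increase;
# for performance a positive value is an improvement.
results = {
    "code_size": {
        "447.dealII": 0.50, "453.povray": 0.42, "433.milc": 0.20,
        "445.gobmk": 0.32, "403.gcc": 0.05, "464.h264ref": 3.62,
    },
    "compile_time": {
        "447.dealII": 0.22, "453.povray": -0.16, "433.milc": 0.09,
        "445.gobmk": -2.43, "403.gcc": 0.06, "464.h264ref": 3.21,
    },
    "performance": {
        "447.dealII": 0.07, "453.povray": 1.79, "433.milc": 1.02,
        "445.gobmk": 0.56, "403.gcc": -0.16, "464.h264ref": -0.41,
    },
}

# Unweighted mean per metric (an assumption, not the author's method).
averages = {metric: mean(values.values()) for metric, values in results.items()}

if __name__ == "__main__":
    for metric, avg in averages.items():
        print(f"{metric}: {avg:+.2f}% on average")
```

Under that assumption the mean performance change is positive, while most of the code-size growth comes from 464.h264ref alone.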

[LLVMdev] X86 FMA4

2012 Jul 26

Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch. -Cameron On Thu, Jul 26, 2012 at 2:27...

Re: 'virsh capabilities' on Debian Wheezy-amd64 reports different cpu to Wheezy-i386 (on same hardware)

2014 Mar 03

...id >>>> >>>> I've just managed to install the libvirt-bin:amd64 package on the same >>>> machine, on the wheezy-i386 distribution. The output of 'virsh >>>> capabilities' is now reported correctly, and my VMs that required >>>> SandyBridge are now booting. >>>> >>>> Have you any further suggestions? >>>> >>> Oh, I missed the fact that it was 32 bit distro. In that case, I >>> guess some cpu features won't be available. No other ideas. >> Ok thanks Martin. If the amd64...
