2014 Apr 07
2
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...y still be race conditions that the current code
is not handling correctly. I will look into that to see where the
problem is. BTW, what test do you use to produce the hang condition?
>
> Patch series with that change gave around 20% improvement for dbench
> 2x and 30% improvement for ebizzy 2x cases. (1x has no significant
> loss/gain).
>
>
What is the baseline for the performance improvement? Is it without the
unfair lock and PV qspinlock?
-Longman
2014 Apr 08
1
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...at the code on halt() related changes.
>>
>> It seems like there may still be race conditions that the current code
>> is not handling correctly. I will look into that to see where the
>> problem is. BTW, what test do you use to produce the hang condition?
>
> Running ebizzy on 2 of the VMs simultaneously (for some time in
> a repeated loop) could reproduce it.
>
Yes, I am able to reproduce the hang problem with ebizzy. BTW, could you
try to apply the attached patch file on top of the v8 patch series to
see if it can fix the hang problem?
>
>> What is...
2014 May 07
0
[PATCH v10 18/19] pvqspinlock, x86: Enable PV qspinlock PV for KVM
...e same 20 physical CPUs
(200% overcommit)
3) Both VMs are active and they share 30 physical CPUs (10 dedicated
and 10 shared - 133% overcommit)
The tests run included the disk workload of the AIM7 benchmark on both
ext4 and xfs RAM disks at 3000 users on a 3.15-rc1 based kernel. The
"ebizzy -m" test was was also run and its performance data were
recorded. With two VMs running, the "idle=poll" kernel option was
added to simulate a busy guest. The entry "unfair + PV qspinlock"
below means that both the unfair lock and PV spinlock configuration
options were turn...
2014 Apr 07
0
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...>> take a closer look at the code on halt() related changes.
>
> It seems like there may still be race conditions that the current code
> is not handling correctly. I will look into that to see where the
> problem is. BTW, what test do you use to produce the hang condition?
Running ebizzy on 2 of the VMs simultaneously (for some time in a repeated
loop) could reproduce it.
>> Patch series with that change gave around 20% improvement for dbench
>> 2x and 30% improvement for ebizzy 2x cases. (1x has no significant
>> loss/gain).
>>
While at it, just a correcti...
2014 Oct 29
0
[PATCH v13 10/11] pvqspinlock, x86: Enable PV qspinlock for KVM
...in one of the following three configurations:
1) Only 1 VM is active
2) Both VMs are active and they share the same 20 physical CPUs
(200% overcommit)
The tests run included the disk workload of the AIM7 benchmark on
both ext4 and xfs RAM disks at 3000 users on a 3.17 based kernel. The
"ebizzy -m" test and futextest was was also run and its performance
data were recorded. With two VMs running, the "idle=poll" kernel
option was added to simulate a busy guest. If PV qspinlock is not
enabled, the unfair lock will be used automatically in a guest.
AIM7 XFS Disk Test...
2014 May 07
1
[PATCH v10 18/19] pvqspinlock, x86: Enable PV qspinlock PV for KVM
...PRAVIRT_SPINLOCK = n PARAVIRT_UNFAIR_LOCKS = n
> (queue spinlock without paravirt)
>
> C = 3.15-rc2 + qspinlock v9 patch with QUEUE_SPINLOCK = y
> PRAVIRT_SPINLOCK = y PARAVIRT_UNFAIR_LOCKS = n
> (queue spinlock with paravirt)
Could you do s/PRAVIRT/PARAVIRT/ please?
>
> Ebizzy %improvements
> ====================
> overcommit        A          B          C
> 0.5x         4.4265     2.0611     1.5824
> 1.0x         0.9015    -7.7828     4.5443
> 1.5x        46.1162    -2.9845    -3.5046
> 2.0x        99.8150    -2.7116     4.7461
Considering B sucks
>...
2014 May 28
7
[RFC] Implement Batched (group) ticket lock
...ze of 4. (As we know, increasing the batch size brings us
closer to unfair locks, and a batch size of 1 = ticketlock).
Result:
Test system: 32-CPU, 2-node machine with 64GB per node (a 16-pCPU machine + HT).
Guests: 8GB, 16-vCPU guests (average of 8 iterations)
% Improvements with kvm guests (batch size = 4):
ebizzy_0.5x 4.3
ebizzy_1.0x 7.8
ebizzy_1.5x 23.4
ebizzy_2.0x 48.6
Baremetal:
ebizzy showed a very high stdev; the kernbench result was good, but neither of them
showed much difference.
ebizzy: rec/sec higher is better
base 50452
patched 50703
kernbench time in sec (lower is better)
base...
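The batching idea described in this RFC entry (a window of BATCH tickets whose holders compete unfairly, degenerating to a plain ticket lock at batch size 1) can be illustrated with a small userspace sketch using C11 atomics. This is my own illustration, not the RFC patch itself: the structure layout and names are invented for clarity, and a kernel implementation would pack the fields into the existing ticket-lock word and use cpu_relax() in the spin loops.

#include <stdatomic.h>
#include <stdbool.h>

#define BATCH 4u                    /* batch size; 1 degenerates to a plain ticket lock */
#define BATCH_MASK (~(BATCH - 1u))

struct batched_ticket_lock {
    atomic_uint next;               /* next ticket to hand out                      */
    atomic_uint head;               /* ticket whose batch window is currently open  */
    atomic_bool taken;              /* won by whichever in-window waiter's cmpxchg succeeds */
};

static void bt_lock(struct batched_ticket_lock *l)
{
    unsigned int me = atomic_fetch_add(&l->next, 1);

    for (;;) {
        /* Wait until my ticket falls inside the open batch window (FIFO across batches). */
        while ((me & BATCH_MASK) != (atomic_load(&l->head) & BATCH_MASK))
            ;                       /* spin; a kernel version would cpu_relax() here */

        /* Everyone inside the window races: unfair within a batch.  This
         * cmpxchg is the extra cost discussed for the bare-metal case. */
        bool expected = false;
        if (atomic_compare_exchange_strong(&l->taken, &expected, true))
            return;
    }
}

static void bt_unlock(struct batched_ticket_lock *l)
{
    atomic_fetch_add(&l->head, 1);  /* one grant consumed; the window advances after BATCH grants */
    atomic_store(&l->taken, false); /* let the next in-window waiter win */
}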
2014 Apr 07
0
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...the code
> to make more efficient use of the lock or finer granularity ones. The
> main purpose is to make the lock contention problems more tolerable
> until someone can spend the time and effort to fix them.
>
> To illustrate the performance benefit of the queue spinlock, the
> ebizzy benchmark was run with the -m option on two different computers:
>
> Test machine             ticket-lock    queue-lock
> ------------             -----------    ----------
> 4-socket 40-core         2316 rec/s     2899 rec/s
>   Westmere-EX (HT off)
> 2-socket 12-core         2130 rec/s     2176 rec/s
>   West...
2014 Apr 27
0
[PATCH v9 00/19] qspinlock: a 4-byte queue spinlock with PV support
...y (unfair lock)
B = 3.15-rc2 + qspinlock v9 patch with QUEUE_SPINLOCK = y
PARAVIRT_SPINLOCK = n PARAVIRT_UNFAIR_LOCKS = n (queue spinlock without
paravirt)
C = 3.15-rc2 + qspinlock v9 patch with QUEUE_SPINLOCK = y
PARAVIRT_SPINLOCK = y PARAVIRT_UNFAIR_LOCKS = n (queue spinlock with
paravirt)
Ebizzy % improvements
=====================
overcommit        A          B          C
0.5x         4.4265     2.0611     1.5824
1.0x         0.9015    -7.7828     4.5443
1.5x        46.1162    -2.9845    -3.5046
2.0x        99.8150    -2.7116     4.7461
Dbench % improvements
overcommit        A          B          C
0.5x         3.2617     3.5436     2.5676
1.0x         0.6302     2.2342...
2014 May 07
0
[PATCH v10 10/19] qspinlock, x86: Allow unfair spinlock in a virtual guest
...ng lock acquirers will have to wait in the
queue in FIFO order. This cannot completely solve the lock waiter
preemption problem, but it does help to alleviate the impact of
this problem.
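To make the behavior described above concrete, here is a minimal userspace sketch of an unfair lock with FIFO-queued waiters, written with C11 atomics. It is my own illustration of the general idea, not the patch's qspinlock code: new arrivals may steal the lock outright, while waiters that fail to steal take a ticket and only the queue head keeps competing.

#include <stdatomic.h>
#include <stdbool.h>

struct unfair_queued_lock {
    atomic_bool locked;   /* lock word that new arrivals may steal     */
    atomic_uint tail;     /* next queue position (ticket) to hand out  */
    atomic_uint head;     /* queue position currently allowed to retry */
};

static bool uql_trylock(struct unfair_queued_lock *l)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

static void uql_lock(struct unfair_queued_lock *l)
{
    if (uql_trylock(l))                               /* unfair fast path: lock stealing */
        return;

    unsigned int me = atomic_fetch_add(&l->tail, 1);  /* join the FIFO queue */
    while (atomic_load(&l->head) != me)               /* wait to become queue head */
        ;                                             /* spin; kernel code would cpu_relax() */

    while (!uql_trylock(l))                           /* queue head races against new arrivals */
        ;
    atomic_fetch_add(&l->head, 1);                    /* hand the queue-head role to the next waiter */
}

static void uql_unlock(struct unfair_queued_lock *l)
{
    atomic_store(&l->locked, false);
}

If the vCPU at the queue head is preempted, new arrivals can still make progress through the stealing path, which is the alleviation the paragraph above refers to; the trade-off is potential starvation of queued waiters.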
To illustrate the performance impact of the various approaches, the
disk workload of the AIM7 benchmark and the ebizzy test were run on
a 4-socket 40-core Westmere-EX system (bare metal, HT off, ramdisk)
on a 3.14 based kernel. The table below shows the performance
of the different kernel flavors.
AIM7 XFS Disk Test
kernel JPM Real Time Sys Time Usr Time
-----...
2014 May 29
0
[RFC] Implement Batched (group) ticket lock
...:
>
> TODO:
> - we need an intelligent way to nullify the effect of batching for baremetal
> (because extra cmpxchg is not required).
To do this, you will need to have 2 slightly different algorithms
depending on the paravirt_ticketlocks_enabled jump label.
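A rough sketch of what that split could look like (my own illustration, not code from the RFC; __ticket_spin_lock_plain() and __ticket_spin_lock_batched() are hypothetical helper names, and the fragment assumes the kernel's existing paravirt_ticketlocks_enabled static key):

/*
 * Sketch only: pick the lock algorithm off the jump label so the bare-metal
 * path keeps the plain ticket protocol and never executes the batching
 * cmpxchg; the branch costs nothing when the key is disabled.
 * The two helpers are placeholders, not functions from the RFC patch.
 */
static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
        if (static_key_false(&paravirt_ticketlocks_enabled))
                __ticket_spin_lock_batched(lock);   /* virtualized guest */
        else
                __ticket_spin_lock_plain(lock);     /* bare metal: no extra cmpxchg */
}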
>
> - My kernbench/ebizzy test on baremetal (32 cpu +ht sandybridge) did not seem to
> show the impact of the extra cmpxchg, but there should be some effect from the extra cmpxchg.
It will depend on the micro-benchmark and the test system used. I had
seen a test case where the extra cmpxchg did not really impact
performance on a W...
2014 May 28
0
[RFC] Implement Batched (group) ticket lock
...extra cmpxchg at times.
> - we may have to make the batch size a kernel arg to solve the above problem
> (to run the same kernel for host/guest). Increasing the batch size also seems to help
> virtualized guests more, so we will have the flexibility of tuning depending on VM size.
>
> - My kernbench/ebizzy test on baremetal (32 cpu +ht sandybridge) did not seem to
> show the impact of the extra cmpxchg, but there should be some effect from the extra cmpxchg.
Canceled out by better NUMA locality?
Or maybe cmpxchg is cheap once you already own the cache line
exclusively?
> - virtualized guest had slight im...
2014 Oct 29
15
[PATCH v13 00/11] qspinlock: a 4-byte queue spinlock with PV support
v12->v13:
- Change patch 9 to generate separate versions of the
queue_spin_lock_slowpath functions for bare metal and PV guest. This
reduces the performance impact of the PV code on bare metal systems.
v11->v12:
- Based on PeterZ's version of the qspinlock patch
(https://lkml.org/lkml/2014/6/15/63).
- Incorporated many of the review comments from Konrad Wilk and
Paolo
2014 Apr 17
33
[PATCH v9 00/19] qspinlock: a 4-byte queue spinlock with PV support
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181843@infradead.org
- Break the more complex patches into smaller ones to ease review effort.
- Fix a racing condition in the PV qspinlock code.
v7->v8:
- Remove one unneeded atomic operation from the slowpath, thus
improving