2014 Apr 07
2
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...y still be race conditions that the current code
is not handling correctly. I will look into that to see where the
problem is. BTW, what test do you use to produce the hang condition?
>
> Patch series with that change gave around 20% improvement for dbench
> 2x and 30% improvement for ebizzy 2x cases. (1x has no significant
> loss/gain).
>
>
What is the baseline for the performance improvement? Is it without the
unfair lock and PV qspinlock?
-Longman
2014 Apr 08
1
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...at the code on halt() related changes.
>>
>> It seems like there may still be race conditions that the current code
>> is not handling correctly. I will look into that to see where the
>> problem is. BTW, what test do you use to produce the hang condition?
>
> Running ebizzy on 2 of the VMs simultaneously (for some time in
> a repeated loop) could reproduce it.
>
Yes, I am able to reproduce the hang problem with ebizzy. BTW, could you
try to apply the attached patch file on top of the v8 patch series to
see if it can fix the hang problem?
>
>> What is...
2014 May 07
0
[PATCH v10 18/19] pvqspinlock, x86: Enable PV qspinlock PV for KVM
...e same 20 physical CPUs
(200% overcommit)
3) Both VMs are active and they share 30 physical CPUs (10 dedicated
and 10 shared - 133% overcommit)
The tests run included the disk workload of the AIM7 benchmark on both
ext4 and xfs RAM disks at 3000 users on a 3.15-rc1 based kernel. The
"ebizzy -m" test was was also run and its performance data were
recorded. With two VMs running, the "idle=poll" kernel option was
added to simulate a busy guest. The entry "unfair + PV qspinlock"
below means that both the unfair lock and PV spinlock configuration
options were turn...
2014 Apr 07
0
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...>> take a closer look at the code on halt() related changes.
>
> It seems like there may still be race conditions that the current code
> is not handling correctly. I will look into that to see where the
> problem is. BTW, what test do you use to produce the hang condition?
Running ebizzy on 2 of the VMs simultaneously (for some time in a repeated
loop) could reproduce it.
>> Patch series with that change gave around 20% improvement for dbench
>> 2x and 30% improvement for ebizzy 2x cases. (1x has no significant
>> loss/gain).
>>
While at it, just a correcti...
2014 Oct 29
0
[PATCH v13 10/11] pvqspinlock, x86: Enable PV qspinlock for KVM
...in one of the following three configurations:
1) Only 1 VM is active
2) Both VMs are active and they share the same 20 physical CPUs
(200% overcommit)
The tests run included the disk workload of the AIM7 benchmark on
both ext4 and xfs RAM disks at 3000 users on a 3.17 based kernel. The
"ebizzy -m" test and futextest was was also run and its performance
data were recorded. With two VMs running, the "idle=poll" kernel
option was added to simulate a busy guest. If PV qspinlock is not
enabled, the unfair lock will be used automatically in a guest.
AIM7 XFS Disk Test...
2014 May 07
1
[PATCH v10 18/19] pvqspinlock, x86: Enable PV qspinlock PV for KVM
...PRAVIRT_SPINLOCK = n PARAVIRT_UNFAIR_LOCKS = n
> (queue spinlock without paravirt)
>
> C = 3.15-rc2 + qspinlock v9 patch with QUEUE_SPINLOCK = y
> PRAVIRT_SPINLOCK = y PARAVIRT_UNFAIR_LOCKS = n
> (queue spinlock with paravirt)
Could you do s/PRAVIRT/PARAVIRT/ please?
>
> Ebizzy %improvements
> ====================
> overcommit        A          B          C
> 0.5x         4.4265     2.0611     1.5824
> 1.0x         0.9015    -7.7828     4.5443
> 1.5x        46.1162    -2.9845    -3.5046
> 2.0x        99.8150    -2.7116     4.7461
Considering B sucks
>...
2014 May 28
7
[RFC] Implement Batched (group) ticket lock
...ze of 4. (As we know, increasing the batch size brings us
closer to unfair locks, and a batch size of 1 = ticketlock).
Result:
Test system: 32-CPU, 2-node machine with 64GB per node (a 16-pCPU machine + HT).
Guests: 8GB, 16-vCPU guests (average of 8 iterations)
% Improvements with kvm guests (batch size = 4):
ebizzy_0.5x 4.3
ebizzy_1.0x 7.8
ebizzy_1.5x 23.4
ebizzy_2.0x 48.6
Baremetal:
ebizzy showed a very high stdev; the kernbench result was good, but neither of them
showed much difference.
ebizzy: rec/sec higher is better
base 50452
patched 50703
kernbench time in sec (lower is better)
base...
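The batching idea described in this RFC entry (a window of BATCH tickets whose holders compete unfairly, degenerating to a plain ticket lock at batch size 1) can be illustrated with a small userspace sketch using C11 atomics. This is my own illustration, not the RFC patch itself: the structure layout and names are invented for clarity, and a kernel implementation would pack the fields into the existing ticket-lock word and use cpu_relax() in the spin loops.

#include <stdatomic.h>
#include <stdbool.h>

#define BATCH 4u                    /* batch size; 1 degenerates to a plain ticket lock */
#define BATCH_MASK (~(BATCH - 1u))

struct batched_ticket_lock {
    atomic_uint next;               /* next ticket to hand out                      */
    atomic_uint head;               /* ticket whose batch window is currently open  */
    atomic_bool taken;              /* won by whichever in-window waiter's cmpxchg succeeds */
};

static void bt_lock(struct batched_ticket_lock *l)
{
    unsigned int me = atomic_fetch_add(&l->next, 1);

    for (;;) {
        /* Wait until my ticket falls inside the open batch window (FIFO across batches). */
        while ((me & BATCH_MASK) != (atomic_load(&l->head) & BATCH_MASK))
            ;                       /* spin; a kernel version would cpu_relax() here */

        /* Everyone inside the window races: unfair within a batch.  This
         * cmpxchg is the extra cost discussed for the bare-metal case. */
        bool expected = false;
        if (atomic_compare_exchange_strong(&l->taken, &expected, true))
            return;
    }
}

static void bt_unlock(struct batched_ticket_lock *l)
{
    atomic_fetch_add(&l->head, 1);  /* one grant consumed; the window advances after BATCH grants */
    atomic_store(&l->taken, false); /* let the next in-window waiter win */
}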
2014 Apr 07
0
[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support
...the code
> to make more efficient use of the lock or finer granularity ones. The
> main purpose is to make the lock contention problems more tolerable
> until someone can spend the time and effort to fix them.
>
> To illustrate the performance benefit of the queue spinlock, the
> ebizzy benchmark was run with the -m option on two different computers:
>
> Test machine             ticket-lock    queue-lock
> ------------             -----------    ----------
> 4-socket 40-core         2316 rec/s     2899 rec/s
>   Westmere-EX (HT off)
> 2-socket 12-core         2130 rec/s     2176 rec/s
>   West...
2014 Apr 27
0
[PATCH v9 00/19] qspinlock: a 4-byte queue spinlock with PV support
...y (unfair lock)
B = 3.15-rc2 + qspinlock v9 patch with QUEUE_SPINLOCK = y
PARAVIRT_SPINLOCK = n PARAVIRT_UNFAIR_LOCKS = n (queue spinlock without
paravirt)
C = 3.15-rc2 + qspinlock v9 patch with QUEUE_SPINLOCK = y
PARAVIRT_SPINLOCK = y PARAVIRT_UNFAIR_LOCKS = n (queue spinlock with
paravirt)
Ebizzy % improvements
=====================
overcommit        A          B          C
0.5x         4.4265     2.0611     1.5824
1.0x         0.9015    -7.7828     4.5443
1.5x        46.1162    -2.9845    -3.5046
2.0x        99.8150    -2.7116     4.7461
Dbench % improvements
overcommit        A          B          C
0.5x         3.2617     3.5436     2.5676
1.0x         0.6302     2.2342...
2014 May 07
0
[PATCH v10 10/19] qspinlock, x86: Allow unfair spinlock in a virtual guest
...ng lock acquirers will have to wait in the
queue in FIFO order. This cannot completely solve the lock waiter
preemption problem, but it does help to alleviate the impact of
this problem.
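To make the behavior described above concrete, here is a minimal userspace sketch of an unfair lock with FIFO-queued waiters, written with C11 atomics. It is my own illustration of the general idea, not the patch's qspinlock code: new arrivals may steal the lock outright, while waiters that fail to steal take a ticket and only the queue head keeps competing.

#include <stdatomic.h>
#include <stdbool.h>

struct unfair_queued_lock {
    atomic_bool locked;   /* lock word that new arrivals may steal     */
    atomic_uint tail;     /* next queue position (ticket) to hand out  */
    atomic_uint head;     /* queue position currently allowed to retry */
};

static bool uql_trylock(struct unfair_queued_lock *l)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

static void uql_lock(struct unfair_queued_lock *l)
{
    if (uql_trylock(l))                               /* unfair fast path: lock stealing */
        return;

    unsigned int me = atomic_fetch_add(&l->tail, 1);  /* join the FIFO queue */
    while (atomic_load(&l->head) != me)               /* wait to become queue head */
        ;                                             /* spin; kernel code would cpu_relax() */

    while (!uql_trylock(l))                           /* queue head races against new arrivals */
        ;
    atomic_fetch_add(&l->head, 1);                    /* hand the queue-head role to the next waiter */
}

static void uql_unlock(struct unfair_queued_lock *l)
{
    atomic_store(&l->locked, false);
}

If the vCPU at the queue head is preempted, new arrivals can still make progress through the stealing path, which is the alleviation the paragraph above refers to; the trade-off is potential starvation of queued waiters.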
To illustrate the performance impact of the various approaches, the
disk workload of the AIM7 benchmark and the ebizzy test were run on
a 4-socket 40-core Westmere-EX system (bare metal, HT off, ramdisk)
on a 3.14 based kernel. The table below shows the performance
of the different kernel flavors.
AIM7 XFS Disk Test
kernel JPM Real Time Sys Time Usr Time
-----...
2014 May 29
0
[RFC] Implement Batched (group) ticket lock
...:
>
> TODO:
> - we need an intelligent way to nullify the effect of batching for baremetal
> (because extra cmpxchg is not required).
To do this, you will need to have 2 slightly different algorithms
depending on the paravirt_ticketlocks_enabled jump label.
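A rough sketch of what that split could look like (my own illustration, not code from the RFC; __ticket_spin_lock_plain() and __ticket_spin_lock_batched() are hypothetical helper names, and the fragment assumes the kernel's existing paravirt_ticketlocks_enabled static key):

/*
 * Sketch only: pick the lock algorithm off the jump label so the bare-metal
 * path keeps the plain ticket protocol and never executes the batching
 * cmpxchg; the branch costs nothing when the key is disabled.
 * The two helpers are placeholders, not functions from the RFC patch.
 */
static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
        if (static_key_false(&paravirt_ticketlocks_enabled))
                __ticket_spin_lock_batched(lock);   /* virtualized guest */
        else
                __ticket_spin_lock_plain(lock);     /* bare metal: no extra cmpxchg */
}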
>
> - My kernbench/ebizzy test on baremetal (32 cpu +ht sandybridge) did not seem to
> show the impact of the extra cmpxchg, but there should be some effect from the extra cmpxchg.
It will depend on the micro-benchmark and the test system used. I had
seen a test case where the extra cmpxchg did not really impact
performance on a W...
2014 May 28
0
[RFC] Implement Batched (group) ticket lock
...extra cmpxchg at times.
> - we may have to make the batch size a kernel arg to solve the above problem
> (to run the same kernel for host/guest). Increasing the batch size also seems to help
> virtualized guests more, so we will have the flexibility of tuning depending on VM size.
>
> - My kernbench/ebizzy test on baremetal (32 cpu +ht sandybridge) did not seem to
> show the impact of the extra cmpxchg, but there should be some effect from the extra cmpxchg.
Canceled out by better NUMA locality?
Or maybe cmpxchg is cheap once you already own the cache line
exclusively?
> - virtualized guest had slight im...
2014 Oct 29
15
[PATCH v13 00/11] qspinlock: a 4-byte queue spinlock with PV support
v12->v13:
- Change patch 9 to generate separate versions of the
queue_spin_lock_slowpath functions for bare metal and PV guest. This
reduces the performance impact of the PV code on bare metal systems.
v11->v12:
- Based on PeterZ's version of the qspinlock patch
(https://lkml.org/lkml/2014/6/15/63).
- Incorporated many of the review comments from Konrad Wilk and
Paolo
2014 Apr 17
33
[PATCH v9 00/19] qspinlock: a 4-byte queue spinlock with PV support
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181843@infradead.org
- Break the more complex patches into smaller ones to ease review effort.
- Fix a racing condition in the PV qspinlock code.
v7->v8:
- Remove one unneeded atomic operation from the slowpath, thus
improving