Displaying 20 results from an estimated 27 matches for "lfsr".
2018 Nov 26
3
BUGS in code generated for target i386-win32
Hi @ll,
LLVM/clang generates wrong code for the following program
(see <https://godbolt.org/z/UZrrkG>):
--- sample.c ---
unsigned __fastcall lfsr32(unsigned argument, unsigned polynomial)
{
__asm
{
add ecx, ecx ; ecx = argument << 1
sbb eax, eax ; eax = CF ? -1 : 0
and eax, edx ; eax = CF ? polynomial : 0
xor eax, ecx ; eax = (argument << 1) ^ (CF ? polynomial : 0)
}
}
int main()
{...
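The four instructions in the snippet above amount to one left-shift step of a Galois LFSR. A minimal portable C sketch of the same arithmetic follows. The names lfsr32_step and lfsr32_selfcheck and the polynomial 0x04C11DB7 are illustrative assumptions, not taken from the original post.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One left-shift step of a 32-bit Galois LFSR, written in portable C.
 * Each line mirrors one instruction of the inline assembly quoted above.
 * This is a sketch for illustration, not the poster's code.
 */
uint32_t lfsr32_step(uint32_t argument, uint32_t polynomial)
{
    uint32_t carry = argument >> 31;      /* bit shifted out, i.e. CF after "add ecx, ecx" */
    uint32_t mask  = 0u - carry;          /* "sbb eax, eax": all ones if CF was set, else 0 */
    uint32_t feed  = mask & polynomial;   /* "and eax, edx": polynomial or 0 */

    return (argument << 1) ^ feed;        /* "xor eax, ecx" */
}

/* Hypothetical self-check with an arbitrarily chosen polynomial. */
void lfsr32_selfcheck(void)
{
    /* Top bit clear: plain shift, no feedback. */
    assert(lfsr32_step(0x00000001u, 0x04C11DB7u) == 0x00000002u);
    /* Top bit set: the shifted value (0) is XORed with the polynomial. */
    assert(lfsr32_step(0x80000000u, 0x04C11DB7u) == 0x04C11DB7u);
}
```

Because this version takes both inputs as ordinary C parameters, it does not rely on the compiler associating ECX and EDX with the __fastcall arguments.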
2015 Apr 09
6
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
...> +
> +enum vcpu_state {
> + vcpu_running = 0,
> + vcpu_halted,
> +};
> +
> +struct pv_node {
> + struct mcs_spinlock mcs;
> + struct mcs_spinlock __res[3];
> +
> + int cpu;
> + u8 state;
> +};
> +
> +/*
> + * Hash table using open addressing with an LFSR probe sequence.
> + *
> + * Since we should not be holding locks from NMI context (very rare indeed) the
> + * max load factor is 0.75, which is around the point where open addressing
> + * breaks down.
> + *
> + * Instead of probing just the immediate bucket we probe all buckets...
2015 Mar 19
4
[PATCH 8/9] qspinlock: Generic paravirt support
...NR_CPUS is kinda bloated, but it shows the idea I think.
And while this has loops in (the rehashing thing) their fwd progress
does not depend on other CPUs.
And I suspect that for the typical lock contention scenarios its
unlikely we ever really get into long rehashing chains.
---
include/linux/lfsr.h | 49 ++++++++++++
kernel/locking/qspinlock_paravirt.h | 143 ++++++++++++++++++++++++++++++++----
2 files changed, 178 insertions(+), 14 deletions(-)
--- /dev/null
+++ b/include/linux/lfsr.h
@@ -0,0 +1,49 @@
+#ifndef _LINUX_LFSR_H
+#define _LINUX_LFSR_H
+
+/*
+ * Simple Binary...
2015 Apr 09
0
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
...>> + vcpu_halted,
>> +};
>> +
>> +struct pv_node {
>> + struct mcs_spinlock mcs;
>> + struct mcs_spinlock __res[3];
>> +
>> + int cpu;
>> + u8 state;
>> +};
>> +
>> +/*
>> + * Hash table using open addressing with an LFSR probe sequence.
>> + *
>> + * Since we should not be holding locks from NMI context (very rare indeed) the
>> + * max load factor is 0.75, which is around the point where open addressing
>> + * breaks down.
>> + *
>> + * Instead of probing just the immediate buck...
2015 Apr 09
0
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
...> + /*
> > + * We haven't set the _Q_SLOW_VAL yet. So
> > + * the order of writing doesn't matter.
> > + */
> > + smp_wmb(); /* matches rmb from pv_hash_find */
> > + goto done;
> > + }
> > + }
> > +
> > + hash = lfsr(hash, pv_lock_hash_bits, 0);
>
> Since pv_lock_hash_bits is a variable, you end up running through that
> massive if() forest to find the corresponding tap every single time. It
> cannot compile-time optimize it.
>
> Hence:
> hash = lfsr(hash, pv_taps);
>
> (I don...
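The review point above is that selecting the feedback taps from a run-time bit count repeats the same lookup on every call, whereas looking the taps up once and passing them in avoids it. A rough sketch of that split follows. The function names, the supported widths (2 to 5 bits) and the tap masks are assumptions made for this example, since the actual lfsr.h is not shown in the snippet.

```c
#include <assert.h>

/*
 * Hypothetical tap lookup for a right-shifting Galois LFSR.
 * Only widths 2..5 are covered; each mask gives the maximal
 * period of 2^bits - 1 for its width.
 */
unsigned sketch_taps_for_bits(unsigned bits)
{
    switch (bits) {
    case 2:  return 0x3u;
    case 3:  return 0x6u;
    case 4:  return 0xCu;
    case 5:  return 0x14u;
    default: return 0u;   /* unsupported width in this sketch */
    }
}

/* One LFSR step with the taps already resolved by the caller. */
unsigned sketch_lfsr(unsigned state, unsigned taps)
{
    unsigned lsb = state & 1u;

    state >>= 1;
    return lsb ? (state ^ taps) : state;
}

/* Count the steps until the sequence started at 1 returns to 1. */
unsigned sketch_period(unsigned bits)
{
    unsigned taps = sketch_taps_for_bits(bits);  /* looked up once, not per step */
    unsigned state = 1u;
    unsigned steps = 0u;

    assert(taps != 0u);
    do {
        state = sketch_lfsr(state, taps);
        steps++;
    } while (state != 1u);
    return steps;
}
```

In this arrangement the switch runs once per table setup rather than once per probe, which is the effect the suggested lfsr(hash, pv_taps) call is after.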
2018 Nov 26
2
BUGS in code generated for target i386-win32
...inline/using-and-preserving-registers-in-inline-assembly?view=vs-2017
Trust me: I KNOW THIS DOCUMENTATION!
> I'll try to explain a little below how that one mismatch causes the
> issues you're seeing.
>
>> BUG #1: the compiler fails to allocate (EAX for) the variable "lfsr"!
>> BUG #2: the variable "lfsr" is NOT initialized!
>
> Since the __asm isn't linked (as far as Clang is concerned) to
> either input for lfsr32, they're both unused.
REALLY? Or better: OUCH!
Is Clang NOT aware of the __fastcall calling convention and its
re...
2015 Apr 13
1
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote:
> >>+void __init __pv_init_lock_hash(void)
> >>+{
> >>+ int pv_hash_size = 4 * num_possible_cpus();
> >>+
> >>+ if (pv_hash_size< (1U<< LFSR_MIN_BITS))
> >>+ pv_hash_size = (1U<< LFSR_MIN_BITS);
> >>+ /*
> >>+ * Allocate space from bootmem which should be page-size aligned
> >>+ * and hence cacheline aligned.
> >>+ */
> >>+ pv_lock_hash = alloc_large_system_hash("PV q...
2015 Mar 18
2
[PATCH 8/9] qspinlock: Generic paravirt support
On 03/16/2015 09:16 AM, Peter Zijlstra wrote:
> Implement simple paravirt support for the qspinlock.
>
> Provide a separate (second) version of the spin_lock_slowpath for
> paravirt along with a special unlock path.
>
> The second slowpath is generated by adding a few pv hooks to the
> normal slowpath, but where those will compile away for the native
> case, they expand
2015 Apr 07
18
[PATCH v15 00/15] qspinlock: a 4-byte queue spinlock with PV support
...pervisors
pvqspinlock: Implement the paravirt qspinlock for x86
Waiman Long (11):
qspinlock: A simple generic 4-byte queue spinlock
qspinlock, x86: Enable x86-64 to use queue spinlock
qspinlock: Extract out code snippets for the next patch
qspinlock: Use a simple write to grab the lock
lfsr: a simple binary Galois linear feedback shift register
pvqspinlock: Implement simple paravirt support for the qspinlock
pvqspinlock, x86: Enable PV qspinlock for KVM
pvqspinlock, x86: Enable PV qspinlock for Xen
pvqspinlock: Only kick CPU at unlock time
pvqspinlock: Improve slowpath perfo...
2015 Apr 07
0
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
...ue_spin_unlock().
+ */
+
+#define _Q_SLOW_VAL (3U << _Q_LOCKED_OFFSET)
+
+enum vcpu_state {
+ vcpu_running = 0,
+ vcpu_halted,
+};
+
+struct pv_node {
+ struct mcs_spinlock mcs;
+ struct mcs_spinlock __res[3];
+
+ int cpu;
+ u8 state;
+};
+
+/*
+ * Hash table using open addressing with an LFSR probe sequence.
+ *
+ * Since we should not be holding locks from NMI context (very rare indeed) the
+ * max load factor is 0.75, which is around the point where open addressing
+ * breaks down.
+ *
+ * Instead of probing just the immediate bucket we probe all buckets in the
+ * same cacheline.
+ *...
2015 Apr 01
0
[PATCH 8/9] qspinlock: Generic paravirt support
...idea I think.
>
> And while this has loops in (the rehashing thing) their fwd progress
> does not depend on other CPUs.
>
> And I suspect that for the typical lock contention scenarios its
> unlikely we ever really get into long rehashing chains.
>
> ---
> include/linux/lfsr.h | 49 ++++++++++++
> kernel/locking/qspinlock_paravirt.h | 143 ++++++++++++++++++++++++++++++++----
> 2 files changed, 178 insertions(+), 14 deletions(-)
>
> --- /dev/null
>
> +
> +static int pv_hash_find(struct qspinlock *lock)
> +{
> + u64 hash =...
2015 Apr 02
3
[PATCH 8/9] qspinlock: Generic paravirt support
On Thu, Apr 02, 2015 at 12:28:30PM -0400, Waiman Long wrote:
> On 04/01/2015 05:03 PM, Peter Zijlstra wrote:
> >On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote:
> >>On 04/01/2015 02:48 PM, Peter Zijlstra wrote:
> >>I am sorry that I don't quite get what you mean here. My point is that in
> >>the hashing step, a cpu will need to scan an empty
2008 May 07
7
questions from a 10GbE driver author
Hi,
I maintain a driver for a 10GbE nic which supports multiple hardware tx/rx rings. We can steer rx packets into rings using the "standard" NDIS6 Toeplitz hashing on TCP port numbers, IP addresses, etc. We can also steer packets based on MAC address. Would this NIC be considered to be capable of supporting crossbow?
Also, can crossbow do things like steer outgoing packets to the
2015 Apr 02
0
[PATCH 8/9] qspinlock: Generic paravirt support
...&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL);
>
> VS
>
> __pv_queue_spin_unlock():
>
> if (xchg(&l->locked, 0) != _Q_SLOW_VAL)
> return;
>
> /* MB as per xchg */
> pv_hash_find(lock);
>
>
Something like so.. compile tested only.
I took out the LFSR because that was likely over engineering from my
side :-)
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -2,6 +2,8 @@
#error "do not include this file"
#endif
+#include <linux/hash.h>
+
/*
* Implement paravirt qspinlocks; the general i...