thr3ads.net - search: "cacheline"

[PATCH v9 04/19] qspinlock: Extract out the exchange of tail code word

2014 Apr 18

1

[PATCH v9 04/19] qspinlock: Extract out the exchange of tail code word

...014 at 11:03:56AM -0400, Waiman Long wrote: > >>>>@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32 val) > >>>> node->next = NULL; > >>>> > >>>> /* > >>>>+ * We touched a (possibly) cold cacheline; attempt the trylock once > >>>>+ * more in the hope someone let go while we weren't watching as long > >>>>+ * as no one was queuing. > >>>> */ > >>>>+ if (!(val& _Q_TAIL_MASK)&& queue_spin_trylock(lock)) > >...

[LLVMdev] Set alignment of a structure?

2009 May 08

1

[LLVMdev] Set alignment of a structure?

...we wish to transfer from one thread to another. The size and types of V can only be known at compile time. We want to create a corresponding type struct.V to carry those values. In our situation, several threads will be writing to its own structure simultaneously. By aligning the structures to cacheline boundaries, we can ensure that no two of these structures occupy the same cacheline, and so no two threads are competing for that cacheline. As you noted, this case can be handled by setting an alignment on a global variable. (Case 2) We have a group of closely related threads, all running the...
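The cacheline-alignment idea in this message can be sketched in C11 as follows. This is only an illustration under stated assumptions: the 64-byte line size, the `thread_values` type (standing in for the poster's `struct.V`), and the helper names are hypothetical and do not come from the thread or from LLVM.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines (common on x86-64, not universal). */
#define CACHELINE_SIZE 64

/* Hypothetical per-thread payload; the member types are placeholders. */
struct thread_values {
    /* Aligning the first member raises the alignment of the whole struct. */
    alignas(CACHELINE_SIZE) long counter;
    double partial_sum;
};

/* sizeof is rounded up to a multiple of the alignment, so consecutive
 * array elements can never share a cacheline. */
static_assert(alignof(struct thread_values) == CACHELINE_SIZE,
              "struct must be cacheline aligned");
static_assert(sizeof(struct thread_values) % CACHELINE_SIZE == 0,
              "struct must fill whole cachelines");

/* One slot per thread; a global, so the alignment request is honoured. */
static struct thread_values per_thread[4];

/* Index of the cacheline that an address falls into. */
static uintptr_t cacheline_index(const void *p)
{
    return (uintptr_t)p / CACHELINE_SIZE;
}

/* Returns 1 if slots i and j of per_thread lie on different cachelines. */
static int slots_on_distinct_lines(size_t i, size_t j)
{
    return cacheline_index(&per_thread[i]) != cacheline_index(&per_thread[j]);
}
```

Each thread would then write only to its own `per_thread[i]`, so no two writers contend for the same line.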

[PATCH v9 04/19] qspinlock: Extract out the exchange of tail code word

2014 Apr 18

2

[PATCH v9 04/19] qspinlock: Extract out the exchange of tail code word

...jlstra wrote: > >On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote: > >>@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32 val) > >> node->next = NULL; > >> > >> /* > >>+ * We touched a (possibly) cold cacheline; attempt the trylock once > >>+ * more in the hope someone let go while we weren't watching as long > >>+ * as no one was queuing. > >> */ > >>+ if (!(val& _Q_TAIL_MASK)&& queue_spin_trylock(lock)) > >>+ goto release; > >But...

[LLVMdev] Set alignment of a structure?

2009 May 08

2

[LLVMdev] Set alignment of a structure?

...I understand. Setting alignment on a global variable will work for many of my needs. However, say I need to construct an array of OpaqueTypes; can I set an alignment on the elements of that array? For instance, is it possible to have an array where each element is forced to be a multiple of the cacheline size? Thank you again! -- Nick Johnson

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

2019 Aug 14

3

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

Hello Since lot of release (at least since 4.19), I hit the following error message: DMA-API: cacheline tracking ENOMEM, dma-debug disabled After hitting that, I try to check who is creating so many DMA mapping and see: cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c 6 ahci 257 e1000e 6 ehci-pci 5891 nouveau 24 uhci_hcd Does nouveau having this hi...

[LLVMdev] Set alignment of a structure?

2009 May 08

0

[LLVMdev] Set alignment of a structure?

...alignment on a global variable will work for many of my needs. > > However, say I need to construct an array of OpaqueTypes; can I set an > alignment on the elements of that array? For instance, is it possible > to have an array where each element is forced to be a multiple of the > cacheline size? I'm not sure what you mean: do you have a pointer to these, or do you have an array of pointers? Do you know the size of the elements? The compiler can't lay out a structure or array without knowing the size (not just the alignment) of the elements. -Chris
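When the element size is only known at run time, one possible approach is to round the element size up to a multiple of the cacheline size and index the array by that stride. The sketch below is plain C11, not LLVM IR, and is not what either poster proposed; the 64-byte line size and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 64-byte cachelines; must be a power of two for the mask. */
#define CACHELINE_SIZE 64

/* Round elem_size up to the next multiple of CACHELINE_SIZE.
 * Yields 0 for elem_size == 0 or if the addition wraps around. */
static size_t padded_stride(size_t elem_size)
{
    return (elem_size + CACHELINE_SIZE - 1) & ~(size_t)(CACHELINE_SIZE - 1);
}

/* Allocate count elements, each starting on its own cacheline.
 * Returns NULL on failure; release the result with free(). */
static void *alloc_padded_array(size_t count, size_t elem_size)
{
    size_t stride = padded_stride(elem_size);

    if (count == 0 || stride == 0 || count > SIZE_MAX / stride)
        return NULL;
    /* The total size is a multiple of the alignment, as C11 requires. */
    return aligned_alloc(CACHELINE_SIZE, count * stride);
}

/* Address of element index in an array made by alloc_padded_array(). */
static void *padded_elem(void *base, size_t index, size_t elem_size)
{
    return (char *)base + index * padded_stride(elem_size);
}
```

This matches the point made in the reply: the layout can only be computed once the element size is known, whether by the compiler or, as here, by the program.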

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

2019 Aug 15

1

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote: > On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote: > > Hello > > > > Since lot of release (at least since 4.19), I hit the following error message: > > DMA-API: cacheline tracking ENOMEM, dma-debug disabled > > > > After hitting that, I try to check who is creating so many DMA mapping and see: > > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c > > 6 ahci > > 257 e1000e > > 6 ehci-pci ...

[PATCH v10 03/19] qspinlock: Add pending bit

2014 May 14

2

[PATCH v10 03/19] qspinlock: Add pending bit

...dering why > > don't we use more pending bits; advantages are the same, just diminished > > by the probability of having an ideally contended lock: > > - waiter won't be blocked on RAM access if critical section (or more) > > ends sooner > > - some unlucky cacheline is not forgotten > > - faster unlock (no need for tail operations) > > (- ?) > > disadvantages are magnified: > > - increased complexity > > - intense cacheline sharing > > (I thought that this is the main disadvantage of ticketlock.) > > (- ?) > ...

[PATCH v10 03/19] qspinlock: Add pending bit

2014 May 14

2

[PATCH v10 03/19] qspinlock: Add pending bit

...-bit is effectively a lock in a lock, so I was wondering why don't we use more pending bits; advantages are the same, just diminished by the probability of having an ideally contended lock: - waiter won't be blocked on RAM access if critical section (or more) ends sooner - some unlucky cacheline is not forgotten - faster unlock (no need for tail operations) (- ?) disadvantages are magnified: - increased complexity - intense cacheline sharing (I thought that this is the main disadvantage of ticketlock.) (- ?) One bit still improved performance, is it the best we got? Thanks.

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

2019 Aug 16

1

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote: > On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote: > > Hello > > > > Since lot of release (at least since 4.19), I hit the following error message: > > DMA-API: cacheline tracking ENOMEM, dma-debug disabled > > > > After hitting that, I try to check who is creating so many DMA mapping and see: > > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c > > 6 ahci > > 257 e1000e > > 6 ehci-pci ...

[PATCH v2 5/8] virtio/s390: use cacheline aligned airq bit vectors

2019 May 27

1

[PATCH v2 5/8] virtio/s390: use cacheline aligned airq bit vectors

On Thu, 23 May 2019 18:22:06 +0200 Michael Mueller <mimu at linux.ibm.com> wrote: > From: Halil Pasic <pasic at linux.ibm.com> > > The flag AIRQ_IV_CACHELINE was recently added to airq_iv_create(). Let > us use it! We actually wanted the vector to span a cacheline all along. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-)...

[PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019 Jul 22

2

[PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

...ode llist; /* 0 8 */ smp_call_func_t func; /* 8 8 */ void * info; /* 16 8 */ unsigned int flags; /* 24 4 */ /* size: 32, cachelines: 1, members: 4 */ /* padding: 4 */ /* last cacheline: 32 bytes */ }; struct flush_tlb_info { struct mm_struct * mm; /* 0 8 */ long unsigned int start; /* 8 8 */ long unsigned int end...
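The offset/size annotations in this snippet can be checked from C with `offsetof` and `sizeof`. The sketch below uses a hypothetical struct with the same member shape (the real struct name and first member type are truncated above, so `example_call_data` and `example_node` are stand-ins) and assumes an LP64 ABI with 64-byte cachelines.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines. */
#define CACHELINE_SIZE 64

/* Stand-in for the callback type shown in the snippet. */
typedef void (*call_func_t)(void *info);

/* Hypothetical pointer-sized list node used as the first member. */
struct example_node {
    struct example_node *next;
};

/* Hypothetical struct mirroring the member layout in the snippet. */
struct example_call_data {
    struct example_node llist; /* offset  0, size 8 on LP64 */
    call_func_t func;          /* offset  8, size 8 */
    void *info;                /* offset 16, size 8 */
    unsigned int flags;        /* offset 24, size 4 */
};

/* Compile-time check that members are laid out back to back. */
static_assert(offsetof(struct example_call_data, func) ==
                  sizeof(struct example_node),
              "func should directly follow llist");

/* Bytes of padding after the last member. */
static size_t trailing_padding(void)
{
    return sizeof(struct example_call_data) -
           (offsetof(struct example_call_data, flags) + sizeof(unsigned int));
}

/* Number of cachelines an object of the given size spans when it
 * starts on a cacheline boundary. */
static size_t cachelines_spanned(size_t size)
{
    return (size + CACHELINE_SIZE - 1) / CACHELINE_SIZE;
}
```

On an LP64 target this reproduces the figures quoted in the snippet: a 32-byte struct with 4 bytes of trailing padding that fits in one cacheline.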

[PATCH 03/11] qspinlock: Add pending bit

2014 Jun 17

5

[PATCH 03/11] qspinlock: Add pending bit

On Sun, Jun 15, 2014 at 02:47:00PM +0200, Peter Zijlstra wrote: > Because the qspinlock needs to touch a second cacheline; add a pending > bit and allow a single in-word spinner before we punt to the second > cacheline. Could you add this in the description please: And by second cacheline we mean the local 'node'. That is the: mcs_nodes[0] and mcs_nodes[idx] Perhaps it might be better then to split th...

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

2019 Aug 14

0

DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote: > Hello > > Since lot of release (at least since 4.19), I hit the following error message: > DMA-API: cacheline tracking ENOMEM, dma-debug disabled > > After hitting that, I try to check who is creating so many DMA mapping and see: > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c > 6 ahci > 257 e1000e > 6 ehci-pci > 5891 nouveau >...
