search for: dcache

Displaying results from an estimated 129 matches for "dcache".

2015 Nov 18
2
[PATCH] virtio_ring: Shadow available ring flags & index
...(note -- w/ core turbo disabled, performance is _very_ stable; variance of < 0.5% run-to-run; figure of merit is "seconds elapsed" here)

* Producer / consumer bound to Hyperthread pairs:

Performance counter stats for './vring_bench_noshadow 1000000000':

    343,425,166,916  L1-dcache-loads
         21,393,148  L1-dcache-load-misses    # 0.01% of all L1-dcache hits
     61,709,640,363  L1-dcache-stores
          5,745,690  L1-dcache-store-misses
     10,186,932,553  L1-dcache-prefetches
              1,491  L1-dcache-prefetch-misses

      121.335699344 seconds time elapsed

Performance counter st...
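The mechanism behind those numbers: the patch keeps a guest-private shadow copy of the available ring's flags and index, so the hot path reads and updates the private copy and only touches the cache line shared with the host when the value actually changes. A rough sketch of the pattern (field names follow the posted patch; the surrounding driver structure is abridged, and endian conversion and barriers are omitted):

/* Rough sketch of the shadowing idea, abridged from the posted patch;
 * kernel types assumed, endian conversion and barriers omitted. */
struct vring_virtqueue {
	struct vring vring;
	u16 avail_flags_shadow;	/* private copy of vring.avail->flags */
	u16 avail_idx_shadow;	/* private copy of vring.avail->idx */
	/* ... */
};

static void shadow_disable_interrupts(struct vring_virtqueue *vq)
{
	/* Test the private shadow instead of re-reading the shared ring,
	 * and skip the store (and the cache-line bounce it causes when
	 * the host polls the ring) if the flag is already set. */
	if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
		vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
		vq->vring.avail->flags = vq->avail_flags_shadow;
	}
}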
2015 Nov 18
0
[PATCH] virtio_ring: Shadow available ring flags & index
...rformance is _very_ stable; variance of
> < 0.5% run-to-run; figure of merit is "seconds elapsed" here)
>
> * Producer / consumer bound to Hyperthread pairs:
>
> Performance counter stats for './vring_bench_noshadow 1000000000':
>
>     343,425,166,916  L1-dcache-loads
>          21,393,148  L1-dcache-load-misses    # 0.01% of all L1-dcache hits
>      61,709,640,363  L1-dcache-stores
>           5,745,690  L1-dcache-store-misses
>      10,186,932,553  L1-dcache-prefetches
>               1,491  L1-dcache-prefetch-misses
>
>       121.335699344 seconds time el...
2015 Nov 19
1
[PATCH] virtio_ring: Shadow available ring flags & index
...e; variance of
>> < 0.5% run-to-run; figure of merit is "seconds elapsed" here)
>>
>> * Producer / consumer bound to Hyperthread pairs:
>>
>> Performance counter stats for './vring_bench_noshadow 1000000000':
>>
>>     343,425,166,916  L1-dcache-loads
>>          21,393,148  L1-dcache-load-misses    # 0.01% of all L1-dcache hits
>>      61,709,640,363  L1-dcache-stores
>>           5,745,690  L1-dcache-store-misses
>>      10,186,932,553  L1-dcache-prefetches
>>               1,491  L1-dcache-prefetch-misses
>>       121.3...
2015 Nov 13
2
[PATCH] virtio_ring: Shadow available ring flags & index
...nch, the time required for
> > 10,000,000 buffer checkout/returns was reduced by ~2% (average
> > across many runs) on an AMD Piledriver (15h) CPU:
> >
> > (w/o shadowing):
> > Performance counter stats for './vring_bench':
> >     5,451,082,016  L1-dcache-loads
> > ...
> >     2.221477739 seconds time elapsed
> >
> > (w/ shadowing):
> > Performance counter stats for './vring_bench':
> >     5,405,701,361  L1-dcache-loads
> > ...
> >     2.168405376 seconds time elapsed
...
2004 Jun 14
0
[PATCH] dcache.c polishing
kill dead ocfs_empty stuff, cleanup d_revalidate handling.

Index: dcache.c
===================================================================
--- dcache.c	(revision 1091)
+++ dcache.c	(working copy)
@@ -44,24 +44,11 @@
 #define OCFS_DEBUG_CONTEXT OCFS_DEBUG_CONTEXT_DCACHE
 
-static int ocfs_empty_func(struct dentry *dentry, void *ignore);
-
-/*
- * ocfs_dentry_rev...
2013 Jan 15
0
[LLVMdev] Dynamic Profiling - Instrumentation basic query
...the execution time, or is it the profiling itself? I can
> probably use a data structure to store the output instead.
>
> Also, I have heard of Intel's Pin tool which can provide memory trace
> information. Could you please explain to me what you meant by hardware
> counters for dcache miss/hit rates.

I've also heard of Pin, but never actually used it. Regarding the hardware counters: x86 processors count various hardware events via internal counters. I think both Intel and AMD processors can do this, but I've only tried out Intel. The easiest way to access these o...
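For reference, on Linux these counters are exposed through the perf_event_open(2) syscall (the same interface the perf tool uses). A minimal self-contained sketch that counts L1 dcache loads and load misses around a region of code -- the strided-read loop is just a stand-in workload:

#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_counter(uint64_t result)	/* ACCESS or MISS */
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HW_CACHE;
	attr.config = PERF_COUNT_HW_CACHE_L1D |
		      (PERF_COUNT_HW_CACHE_OP_READ << 8) |
		      (result << 16);
	attr.disabled = 1;
	attr.exclude_kernel = 1;
	/* count for this thread, on any CPU */
	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

int main(void)
{
	static char buf[64 << 20];	/* stand-in workload: 64 MB */
	int loads = open_counter(PERF_COUNT_HW_CACHE_RESULT_ACCESS);
	int misses = open_counter(PERF_COUNT_HW_CACHE_RESULT_MISS);
	uint64_t n_loads = 0, n_misses = 0;
	long sum = 0;

	if (loads < 0 || misses < 0) {
		perror("perf_event_open");
		return 1;
	}
	ioctl(loads, PERF_EVENT_IOC_ENABLE, 0);
	ioctl(misses, PERF_EVENT_IOC_ENABLE, 0);

	for (size_t i = 0; i < sizeof(buf); i += 64)
		sum += buf[i];		/* one load per cache line */

	ioctl(loads, PERF_EVENT_IOC_DISABLE, 0);
	ioctl(misses, PERF_EVENT_IOC_DISABLE, 0);
	read(loads, &n_loads, sizeof(n_loads));
	read(misses, &n_misses, sizeof(n_misses));
	printf("L1-dcache-loads %llu, misses %llu (sum %ld)\n",
	       (unsigned long long)n_loads, (unsigned long long)n_misses, sum);
	return 0;
}

The perf tool reports the same events without writing any code, e.g. perf stat -e L1-dcache-loads,L1-dcache-load-misses ./prog.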
2013 Jan 14
2
[LLVMdev] Dynamic Profiling - Instrumentation basic query
...to a file would increase the execution time, or is it the profiling itself? I can probably use a data structure to store the output instead.

Also, I have heard of Intel's Pin tool which can provide memory trace information. Could you please explain to me what you meant by hardware counters for dcache miss/hit rates.

@Criswell: Thank you so much for helping me with this. I am starting to write my own code, but having a look at the existing code would definitely help me.

Thanks and Regards,
Silky

On Mon, Jan 14, 2013 at 12:06 AM, Criswell, John T <criswell at illinois.edu> wrote:
> T...
2009 Jul 20
1
[PATCH] ocfs2: flush dentry lock drop when sync ocfs2 volume.
...t lead to at least 2 bugs. See http://oss.oracle.com/bugzilla/show_bug.cgi?id=1133 and http://oss.oracle.com/bugzilla/show_bug.cgi?id=1135. And it happens easily if we have opened a lot of inodes. For 1135, the reason is that umount calls generic_shutdown_super, which does:

1. shrink_dcache_for_umount
2. sync_filesystem
3. invalidate_inodes

In shrink_dcache_for_umount, we drop the dentry and queue ocfs2_wq for the dentry lock put, while in invalidate_inodes we call invalidate_list, which iterates over all the inodes for the sb. The bad thing is that in this function it will ca...
2019 Apr 09
2
[PATCH net] vhost: flush dcache page when logging dirty pages
We set the dirty bit through kmaps and access the pages through kernel virtual addresses; this may result in aliases in virtually tagged caches that require a dcache flush afterwards.

Cc: Christoph Hellwig <hch at infradead.org>
Cc: James Bottomley <James.Bottomley at HansenPartnership.com>
Cc: Andrea Arcangeli <aarcange at redhat.com>
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jas...
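The idiom being applied is the standard kernel pattern for writing to a page through a temporary kernel mapping: flush the dcache afterwards so the write is visible through other virtual aliases. A simplified sketch of that pattern (not the actual vhost code):

/* Simplified sketch of the idiom, not the actual vhost code; assumes a
 * struct page *page and a bit number nr within it. */
static void log_dirty_bit(struct page *page, unsigned int nr)
{
	void *base = kmap_atomic(page);		/* temporary kernel mapping */

	set_bit(nr, (unsigned long *)base);	/* write via the kernel alias */
	kunmap_atomic(base);
	flush_dcache_page(page);	/* make the write visible through other
					 * (e.g. userspace) aliases on virtually
					 * tagged caches; a no-op on x86 */
}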
2019 Apr 09
0
[PATCH net] vhost: flush dcache page when logging dirty pages
On Tue, Apr 09, 2019 at 12:16:47PM +0800, Jason Wang wrote:
> We set the dirty bit through kmaps and access the pages through
> kernel virtual addresses; this may result in aliases in virtually
> tagged caches that require a dcache flush afterwards.
>
> Cc: Christoph Hellwig <hch at infradead.org>
> Cc: James Bottomley <James.Bottomley at HansenPartnership.com>
> Cc: Andrea Arcangeli <aarcange at redhat.com>
> Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")

This is...
2015 Nov 17
0
[PATCH] virtio_ring: Shadow available ring flags & index
...>> core -> core optimally.
>> Sounds logical, I'll apply this after a bit of testing
>> of my own, thanks!
> Thanks!

Venkatesh: Is it that your patch only applies to CPUs w/ exclusive caches? Do you have perf data on Intel CPUs? For the perf metric you provide, why not L1-dcache-load-misses, which is more meaningful?

>>> In a concurrent version of vring_bench, the time required for
>>> 10,000,000 buffer checkout/returns was reduced by ~2% (average
>>> across many runs) on an AMD Piledriver (15h) CPU:
>>>
>>> (w/o shadowing...
2009 Apr 21
1
[PATCH 1/1] ocfs2: Add missing iput() during error handling in ocfs2_dentry_attach_lock()
...owing message:

(3996,1):dlm_empty_lockres:2708 ERROR: lockres W00000000000000000a1046b06a4382 still has local locks!
kernel BUG in dlm_empty_lockres at /rpmbuild/smushran/BUILD/ocfs2-1.4.2/fs/ocfs2/dlm/dlmmaster.c:2709!

Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
---
 fs/ocfs2/dcache.c | 15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
index 7d60448..b574431 100644
--- a/fs/ocfs2/dcache.c
+++ b/fs/ocfs2/dcache.c
@@ -290,6 +290,21 @@ out_attach:
 	else
 		mlog_errno(ret);
 
+	/*
+	 * In case of error, man...
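The bug class here is an error exit that forgets to drop an inode reference taken earlier, leaving the inode (and its DLM lockres) pinned until umount. A generic sketch of the pattern, with a hypothetical attach_cluster_lock() standing in for the truncated patch body above:

/* Hypothetical sketch of the bug class, not the ocfs2 patch itself;
 * attach_cluster_lock() is a made-up stand-in. */
static int attach_example(struct dentry *dentry)
{
	struct inode *inode = igrab(dentry->d_inode);	/* reference taken */
	int ret;

	if (!inode)
		return -ENOENT;

	ret = attach_cluster_lock(dentry, inode);
	if (ret) {
		iput(inode);	/* the missing release: without it the inode
				 * reference leaks and its lockres stays
				 * busy, tripping the BUG at umount */
		return ret;
	}
	return 0;
}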
2009 May 04
2
[PATCH 1/3] ocfs2: Add missing iput() during error handling in ocfs2_dentry_attach_lock()
...W00000000000000000a1046b06a4382 still has local locks!
kernel BUG in dlm_empty_lockres at /rpmbuild/smushran/BUILD/ocfs2-1.4.2/fs/ocfs2/dlm/dlmmaster.c:2709!

Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com>
Signed-off-by: Joel Becker <joel.becker at oracle.com>
---
 fs/ocfs2/dcache.c | 15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
index fab4911..6bf070a 100644
--- a/fs/ocfs2/dcache.c
+++ b/fs/ocfs2/dcache.c
@@ -289,6 +289,21 @@ out_attach:
 	else
 		mlog_errno(ret);
 
+	/*
+	 * In case of error, man...
2010 Aug 20
0
[PATCH] ocfs2: Don't delete orphaned files if we are in the process of umount.
...ds this flag, it will skip the process of ocfs2_delete_inode if the file is from the orphan dir. We are safe to skip the delete process since it is in the orphan dir, so it will be deleted eventually by another orphan scan, the next mount, or fsck.

Signed-off-by: Tao Ma <tao.ma at oracle.com>
---
 fs/ocfs2/dcache.c |    4 ++--
 fs/ocfs2/inode.c  |   24 ++++++++++++++++++++++--
 fs/ocfs2/ocfs2.h  |    2 +-
 fs/ocfs2/super.c  |    2 +-
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
index b4957c7..827ccb8 100644
--- a/fs/ocfs2/dcache.c
+++ b/fs/ocfs2/dcac...
2015 May 06
2
[PATCH 0/6] x86: reduce paravirtualized spinlock overhead
...about 600 to 500 cycles.
>>
>> spin_unlock() for the first time dropped from 145 to 87 cycles.
>>
>> spin_lock() in a loop dropped from 48 to 45 cycles.
>>
>> spin_unlock() in the same loop dropped from 24 to 22 cycles.
>
> Did you isolate icache hot/cold from dcache hot/cold? It seems to me the
> main difference will be whether the branch predictor is warmed up rather
> than if the lock itself is in dcache, but it's much more likely that the
> lock code is in icache if the code is lock intensive, making the cold case
> moot. But that's pure specula...
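The hot/cold distinction in those numbers comes from timing the first acquisition separately from steady-state acquisitions in a loop. A userspace sketch of that methodology with a pthread spinlock and the TSC (an illustration of the measurement approach, not the kernel benchmark from the thread; rdtsc is not serializing, so treat the numbers as rough):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	pthread_spinlock_t lock;
	uint64_t t0, t1, first, total = 0;
	int i;

	pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);

	/* Cold case: first acquisition, lock line and lock code both cold. */
	t0 = rdtsc();
	pthread_spin_lock(&lock);
	t1 = rdtsc();
	first = t1 - t0;
	pthread_spin_unlock(&lock);

	/* Hot case: lock line in dcache, lock code in icache, branch
	 * predictor warm -- the effects the thread is trying to separate. */
	for (i = 0; i < 1000000; i++) {
		t0 = rdtsc();
		pthread_spin_lock(&lock);
		t1 = rdtsc();
		total += t1 - t0;
		pthread_spin_unlock(&lock);
	}

	printf("first lock: %llu cycles, hot average: %llu cycles\n",
	       (unsigned long long)first,
	       (unsigned long long)(total / 1000000));
	return 0;
}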
2015 May 04
2
[PATCH 0/6] x86: reduce paravirtualized spinlock overhead
On 04/30/2015 06:39 PM, Jeremy Fitzhardinge wrote:
> On 04/30/2015 03:53 AM, Juergen Gross wrote:
>> Paravirtualized spinlocks produce some overhead even if the kernel is
>> running on bare metal. The main reason is the more complex locking
>> and unlocking functions. Especially unlocking is no longer just one
>> instruction, but so complex that it is no longer inlined.