Yaniv Kaul
2020-Nov-26 09:49 UTC
[Gluster-users] Poor performance on a server-class system vs. desktop
On Thu, Nov 26, 2020 at 11:44 AM Dmitry Antipov <dmantipov at yandex.ru> wrote:

> BTW, has anyone tried to profile the brick process? I did, and got this
> for the default replica 3 volume ('perf record -F 2500 -g -p [PID]'):

I run a slightly different command, which hides the kernel stuff and
focuses on the user-mode functions:

sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p `pgrep -d\, gluster` -F 2000 -ag

Y.

> +   3.29%  0.02%  glfs_epoll001  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   3.17%  0.01%  glfs_epoll001  [kernel.kallsyms]   [k] do_syscall_64
> +   3.17%  0.02%  glfs_epoll000  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   3.06%  0.02%  glfs_epoll000  [kernel.kallsyms]   [k] do_syscall_64
> +   2.75%  0.01%  glfs_iotwr00f  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.74%  0.01%  glfs_iotwr00b  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.74%  0.01%  glfs_iotwr001  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.73%  0.00%  glfs_iotwr003  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.72%  0.00%  glfs_iotwr000  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.72%  0.01%  glfs_iotwr00c  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.70%  0.01%  glfs_iotwr003  [kernel.kallsyms]   [k] do_syscall_64
> +   2.69%  0.00%  glfs_iotwr001  [kernel.kallsyms]   [k] do_syscall_64
> +   2.69%  0.01%  glfs_iotwr008  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.68%  0.00%  glfs_iotwr00b  [kernel.kallsyms]   [k] do_syscall_64
> +   2.68%  0.00%  glfs_iotwr00c  [kernel.kallsyms]   [k] do_syscall_64
> +   2.68%  0.00%  glfs_iotwr00f  [kernel.kallsyms]   [k] do_syscall_64
> +   2.68%  0.01%  glfs_iotwr000  [kernel.kallsyms]   [k] do_syscall_64
> +   2.67%  0.00%  glfs_iotwr00a  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.65%  0.00%  glfs_iotwr008  [kernel.kallsyms]   [k] do_syscall_64
> +   2.64%  0.00%  glfs_iotwr00e  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.64%  0.01%  glfs_iotwr00d  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.63%  0.01%  glfs_iotwr00a  [kernel.kallsyms]   [k] do_syscall_64
> +   2.63%  0.01%  glfs_iotwr007  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.63%  0.00%  glfs_iotwr005  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.63%  0.01%  glfs_iotwr006  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.63%  0.00%  glfs_iotwr009  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.61%  0.01%  glfs_iotwr004  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.61%  0.01%  glfs_iotwr00e  [kernel.kallsyms]   [k] do_syscall_64
> +   2.60%  0.00%  glfs_iotwr006  [kernel.kallsyms]   [k] do_syscall_64
> +   2.59%  0.00%  glfs_iotwr005  [kernel.kallsyms]   [k] do_syscall_64
> +   2.59%  0.00%  glfs_iotwr00d  [kernel.kallsyms]   [k] do_syscall_64
> +   2.58%  0.00%  glfs_iotwr002  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.58%  0.01%  glfs_iotwr007  [kernel.kallsyms]   [k] do_syscall_64
> +   2.58%  0.00%  glfs_iotwr004  [kernel.kallsyms]   [k] do_syscall_64
> +   2.57%  0.00%  glfs_iotwr009  [kernel.kallsyms]   [k] do_syscall_64
> +   2.54%  0.00%  glfs_iotwr002  [kernel.kallsyms]   [k] do_syscall_64
> +   1.65%  0.00%  glfs_epoll000  [unknown]           [k] 0x0000000000000001
> +   1.65%  0.00%  glfs_epoll001  [unknown]           [k] 0x0000000000000001
> +   1.48%  0.01%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   1.44%  0.08%  glfs_rpcrqhnd  libpthread-2.32.so  [.] pthread_cond_wait@@GLIBC_2.3.2
> +   1.40%  0.01%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] do_syscall_64
> +   1.36%  0.01%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] __x64_sys_futex
> +   1.35%  0.03%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] do_futex
> +   1.34%  0.01%  glfs_iotwr00a  libpthread-2.32.so  [.] __libc_pwrite64
> +   1.32%  0.00%  glfs_iotwr00a  [kernel.kallsyms]   [k] __x64_sys_pwrite64
> +   1.32%  0.00%  glfs_iotwr001  libpthread-2.32.so  [.] __libc_pwrite64
> +   1.31%  0.01%  glfs_iotwr002  libpthread-2.32.so  [.] __libc_pwrite64
> +   1.31%  0.00%  glfs_iotwr00b  libpthread-2.32.so  [.] __libc_pwrite64
> +   1.31%  0.01%  glfs_iotwr00a  [kernel.kallsyms]   [k] vfs_write
> +   1.30%  0.00%  glfs_iotwr001  [kernel.kallsyms]   [k] __x64_sys_pwrite64
> +   1.30%  0.00%  glfs_iotwr008  libpthread-2.32.so  [.] __libc_pwrite64
> +   1.30%  0.00%  glfs_iotwr00a  [kernel.kallsyms]   [k] new_sync_write
> +   1.30%  0.00%  glfs_iotwr00c  libpthread-2.32.so  [.] __libc_pwrite64
> +   1.29%  0.00%  glfs_iotwr00a  [kernel.kallsyms]   [k] xfs_file_write_iter
> +   1.29%  0.01%  glfs_iotwr00a  [kernel.kallsyms]   [k] xfs_file_dio_aio_write
>
> And on replica 3 with storage.linux-aio enabled:
>
> +  11.76%  0.05%  glfs_posixaio  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +  11.42%  0.01%  glfs_posixaio  [kernel.kallsyms]   [k] do_syscall_64
> +   8.81%  0.00%  glfs_posixaio  [unknown]           [k] 0x00000000baadf00d
> +   8.81%  0.00%  glfs_posixaio  [unknown]           [k] 0x0000000000000004
> +   8.74%  0.06%  glfs_posixaio  libc-2.32.so        [.] __GI___writev
> +   8.33%  0.02%  glfs_posixaio  [kernel.kallsyms]   [k] do_writev
> +   8.23%  0.03%  glfs_posixaio  [kernel.kallsyms]   [k] vfs_writev
> +   8.12%  0.05%  glfs_posixaio  [kernel.kallsyms]   [k] do_iter_write
> +   8.02%  0.05%  glfs_posixaio  [kernel.kallsyms]   [k] do_iter_readv_writev
> +   7.96%  0.04%  glfs_posixaio  [kernel.kallsyms]   [k] sock_write_iter
> +   7.92%  0.01%  glfs_posixaio  [kernel.kallsyms]   [k] sock_sendmsg
> +   7.86%  0.01%  glfs_posixaio  [kernel.kallsyms]   [k] tcp_sendmsg
> +   7.28%  0.15%  glfs_posixaio  [kernel.kallsyms]   [k] tcp_sendmsg_locked
> +   6.49%  0.01%  glfs_posixaio  [kernel.kallsyms]   [k] __tcp_push_pending_frames
> +   6.48%  0.10%  glfs_posixaio  [kernel.kallsyms]   [k] tcp_write_xmit
> +   6.31%  0.02%  glfs_posixaio  [unknown]           [k] 0000000000000000
> +   6.05%  0.13%  glfs_posixaio  [kernel.kallsyms]   [k] __tcp_transmit_skb
> +   5.71%  0.06%  glfs_posixaio  [kernel.kallsyms]   [k] __ip_queue_xmit
> +   4.15%  0.03%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   4.07%  0.08%  glfs_posixaio  [kernel.kallsyms]   [k] ip_finish_output2
> +   3.75%  0.02%  glfs_posixaio  [kernel.kallsyms]   [k] asm_call_sysvec_on_stack
> +   3.75%  0.01%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] do_syscall_64
> +   3.70%  0.03%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] __x64_sys_futex
> +   3.68%  0.06%  glfs_posixaio  [kernel.kallsyms]   [k] __local_bh_enable_ip
> +   3.67%  0.07%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] do_futex
> +   3.62%  0.05%  glfs_posixaio  [kernel.kallsyms]   [k] do_softirq
> +   3.61%  0.01%  glfs_posixaio  [kernel.kallsyms]   [k] do_softirq_own_stack
> +   3.59%  0.06%  glfs_posixaio  [kernel.kallsyms]   [k] __softirqentry_text_start
> +   3.44%  0.06%  glfs_posixaio  [kernel.kallsyms]   [k] net_rx_action
> +   3.34%  0.04%  glfs_posixaio  [kernel.kallsyms]   [k] process_backlog
> +   3.28%  0.02%  glfs_posixaio  [kernel.kallsyms]   [k] __netif_receive_skb_one_core
> +   3.08%  0.02%  glfs_epoll000  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   3.02%  0.03%  glfs_epoll001  [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   2.97%  0.01%  glfs_epoll000  [kernel.kallsyms]   [k] do_syscall_64
> +   2.89%  0.01%  glfs_epoll001  [kernel.kallsyms]   [k] do_syscall_64
> +   2.73%  0.08%  glfs_posixaio  [kernel.kallsyms]   [k] nf_hook_slow
> +   2.25%  0.04%  glfs_posixaio  libc-2.32.so        [.] fgetxattr
> +   2.16%  0.14%  glfs_rpcrqhnd  [kernel.kallsyms]   [k] futex_wake
>
> According to these tables, the brick process is just a thin wrapper around
> the system calls and the kernel network subsystem behind them.
>
> For anyone who may find it interesting: the following replica 3 volume
> options:
>
> performance.io-cache-pass-through: on
> performance.iot-pass-through: on
> performance.md-cache-pass-through: on
> performance.nl-cache-pass-through: on
> performance.open-behind-pass-through: on
> performance.read-ahead-pass-through: on
> performance.readdir-ahead-pass-through: on
> performance.strict-o-direct: on
> features.ctime: off
> features.selinux: off
> performance.write-behind: off
> performance.open-behind: off
> performance.quick-read: off
> storage.linux-aio: on
> storage.fips-mode-rchecksum: off
>
> are likely to improve the I/O performance of GFAPI clients (fio with the
> gfapi and gfapi_async engines, qemu -drive file=gluster://XXX, etc.) by
> ~20%. But beware: they can kill the I/O performance of FUSE clients.
>
> Dmitry
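An aside on the second profile above: with storage.linux-aio enabled, the
glfs_posixaio thread submits writes through the kernel AIO interface
instead of blocking in pwrite() on an io-threads worker. For readers who
have not used that interface, here is a minimal standalone sketch of the
io_setup/io_submit/io_getevents cycle it is built on. This is a toy
illustration, not GlusterFS code; the file name aio-demo.dat and the
buffer size are arbitrary.

/* aio-demo.c -- toy illustration of the Linux AIO pattern that
 * storage.linux-aio enables in the brick; NOT GlusterFS code.
 * Build with: gcc -D_GNU_SOURCE aio-demo.c -o aio-demo -laio
 * The file must be on a filesystem that supports O_DIRECT (tmpfs does
 * not); kernel AIO is only truly asynchronous with O_DIRECT. */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    enum { BUF_SIZE = 4096 };
    io_context_t ctx = 0;
    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;
    int rc;

    /* O_DIRECT requires block-aligned buffers */
    if (posix_memalign(&buf, 4096, BUF_SIZE))
        return 1;
    memset(buf, 'x', BUF_SIZE);

    int fd = open("aio-demo.dat", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* create the kernel AIO context (libaio returns -errno on failure) */
    rc = io_setup(128, &ctx);
    if (rc < 0) { fprintf(stderr, "io_setup: %s\n", strerror(-rc)); return 1; }

    /* queue one 4 KiB write at offset 0; io_submit() returns right away
     * instead of blocking the way pwrite() does in the io-threads path */
    io_prep_pwrite(&cb, fd, buf, BUF_SIZE, 0);
    rc = io_submit(ctx, 1, cbs);
    if (rc != 1) { fprintf(stderr, "io_submit: %s\n", strerror(-rc)); return 1; }

    /* reap the completion; the brick does this from a dedicated thread */
    rc = io_getevents(ctx, 1, 1, &ev, NULL);
    if (rc != 1) { fprintf(stderr, "io_getevents failed\n"); return 1; }
    printf("write completed, res=%ld\n", (long)ev.res);

    io_destroy(ctx);
    close(fd);
    free(buf);
    return 0;
}

Because submission does not block, one thread can keep many writes in
flight, which is consistent with the posixaio profile above being
dominated by socket sends rather than threads parked in pwrite().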
Dmitry Antipov
2020-Nov-26 12:31 UTC
[Gluster-users] Poor performance on a server-class system vs. desktop
On 11/26/20 12:49 PM, Yaniv Kaul wrote:

> I run a slightly different command, which hides the kernel stuff and
> focuses on the user-mode functions:
> sudo perf record --call-graph dwarf -j any --buildid-all --all-user -p `pgrep -d\, gluster` -F 2000 -ag

Thanks. BTW, how much overhead does passing data between xlators add? Even
if most of their features are disabled, just passing a request through all
of the layers below is unlikely to have near-zero overhead:

Thread 14 (Thread 0x7f2c0e7fc640 (LWP 19482) "glfs_rpcrqhnd"):
#0  data_unref (this=0x7f2bfc032e68) at dict.c:768
#1  0x00007f2c290b90b9 in dict_deln (keylen=<optimized out>, key=0x7f2c163d542e "glusterfs.inodelk-dom-count", this=0x7f2bfc0bb1c8) at dict.c:645
#2  dict_deln (this=0x7f2bfc0bb1c8, key=0x7f2c163d542e "glusterfs.inodelk-dom-count", keylen=<optimized out>) at dict.c:614
#3  0x00007f2c163c87ee in pl_get_xdata_requests (local=0x7f2bfc0ea658, xdata=0x7f2bfc0bb1c8) at posix.c:238
#4  0x00007f2c163b3267 in pl_get_xdata_requests (xdata=0x7f2bfc0bb1c8, local=<optimized out>) at posix.c:213
#5  pl_writev (frame=0x7f2bfc0d5348, this=0x7f2c08014830, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, offset=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at posix.c:2299
#6  0x00007f2c16395e31 in worm_writev (frame=0x7f2bfc0d5348, this=<optimized out>, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, offset=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at worm.c:429
#7  0x00007f2c1638a55f in ro_writev (frame=frame@entry=0x7f2bfc0d5348, this=<optimized out>, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at read-only-common.c:374
#8  0x00007f2c163705ac in leases_writev (frame=frame@entry=0x7f2bfc0bf148, this=0x7f2c0801a230, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at leases.c:132
#9  0x00007f2c1634f6a8 in up_writev (frame=0x7f2bfc067508, this=0x7f2c0801bf00, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at upcall.c:124
#10 0x00007f2c2913e6c2 in default_writev (frame=0x7f2bfc067508, this=<optimized out>, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at defaults.c:2550
#11 0x00007f2c2913e6c2 in default_writev (frame=frame@entry=0x7f2bfc067508, this=<optimized out>, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at defaults.c:2550
#12 0x00007f2c16315eb7 in marker_writev (frame=frame@entry=0x7f2bfc119e48, this=this@entry=0x7f2c08021440, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, offset=offset@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at marker.c:940
#13 0x00007f2c162fc0ab in barrier_writev (frame=0x7f2bfc119e48, this=<optimized out>, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at barrier.c:248
#14 0x00007f2c2913e6c2 in default_writev (frame=0x7f2bfc119e48, this=<optimized out>, fd=0x7f2bfc0bc768, vector=0x7f2bfc105478, count=1, off=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at defaults.c:2550
#15 0x00007f2c162c5cda in quota_writev (frame=frame@entry=0x7f2bfc119e48, this=<optimized out>, fd=fd@entry=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=count@entry=1, off=off@entry=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at quota.c:1947
#16 0x00007f2c16299c89 in io_stats_writev (frame=frame@entry=0x7f2bfc0e4358, this=this@entry=0x7f2c08029df0, fd=0x7f2bfc0bc768, vector=vector@entry=0x7f2bfc105478, count=1, offset=108306432, flags=0, iobref=0x7f2c080820d0, xdata=0x7f2bfc0bb1c8) at io-stats.c:2893
#17 0x00007f2c161f01ac in server4_writev_resume (frame=0x7f2bfc0ef5c8, bound_xl=0x7f2c08029df0) at server-rpc-fops_v2.c:3017
#18 0x00007f2c161f901c in resolve_and_resume (fn=<optimized out>, frame=<optimized out>) at server-resolve.c:680
#19 server4_0_writev (req=<optimized out>) at server-rpc-fops_v2.c:3943
#20 0x00007f2c290696e5 in rpcsvc_request_handler (arg=0x7f2c1614c0b8) at rpcsvc.c:2233
#21 0x00007f2c28ffa3f9 in start_thread (arg=0x7f2c0e7fc640) at pthread_create.c:463
#22 0x00007f2c28f25903 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Dmitry
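One rough way to frame the question: each translator in the backtrace adds
at least a function call plus some per-request bookkeeping (the dict
lookups visible in frames #0-#4) before winding the fop to the next layer.
The toy benchmark below times a chain of pass-through layers doing a
trivial per-layer key comparison. It is NOT GlusterFS code, every name in
it is made up for the model, and it deliberately omits the frame
allocation and dict locking that dominate the real cost, so treat its
output as a lower bound on the per-fop pass-through overhead.

/* xlator-chain.c -- toy model of the pass-through cost: each "layer"
 * inspects the request's metadata key and hands it to the next layer,
 * loosely mimicking the pl_writev -> worm_writev -> ... -> io_stats_writev
 * chain above.  Build with: gcc -O2 xlator-chain.c -o xlator-chain */
#include <stdio.h>
#include <string.h>
#include <time.h>

enum { NLAYERS = 12,          /* roughly the depth of the backtrace above */
       NCALLS  = 1000000 };

struct fop {                  /* stand-in for a call frame plus xdata dict */
    const char *key;
    long long   offset;
    int         hops;
};

static volatile long long sink;   /* keeps the optimizer from deleting the loop */

static long long passthrough_writev(struct fop *req, int layer)
{
    /* per-layer work that happens even when a feature is "off":
     * peek at the xdata dict before winding to the next layer */
    if (strcmp(req->key, "glusterfs.inodelk-dom-count") == 0)
        req->hops++;
    if (layer + 1 == NLAYERS)
        return req->offset;       /* bottom layer: would issue pwrite() */
    return passthrough_writev(req, layer + 1);
}

int main(void)
{
    struct fop req = { "glusterfs.inodelk-dom-count", 108306432, 0 };
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NCALLS; i++)
        sink += passthrough_writev(&req, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%d layers: %.1f ns per call chain\n", NLAYERS, ns / NCALLS);
    return 0;
}

On typical hardware this measures only tens of nanoseconds per chain, so
whether the pass-through cost matters in practice relative to the futex
and network time dominating the perf profiles earlier in the thread is
something only profiling the real brick process can answer.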