Riccardo Murri
2017-Dec-05 11:11 UTC
[Gluster-users] SAMBA VFS module for GlusterFS crashes
Hello,

I'm trying to set up a SAMBA server serving a GlusterFS volume. Everything works fine if I locally mount the GlusterFS volume (`mount -t glusterfs ...`) and then serve the mounted FS through SAMBA, but performance is 2x to 3x slower than a SAMBA server with a local ext4 filesystem.

I gather that the SAMBA vfs_glusterfs module can give better performance. However, as soon as I switch SMBD to using GFAPI (`vfs objects = glusterfs`), the SMBD process crashes, leaving a backtrace in the logs. I can see no errors in the associated GlusterFS log file, and everything runs fine when mounting locally. I am copying below the share definition, the SMBD crash dump from the logs, and the GlusterFS client logs (produced by the `vfs_glusterfs.so` module in SMBD).

I'm using Ubuntu 16.04.3 with samba package 2:4.3.11+dfsg-0ubuntu0.16.04.12 (I locally recompiled `samba-vfs-modules` to enable the GlusterFS VFS, which is disabled by default in Ubuntu builds).

Any suggestions on what/where to look?

Alternatively: how can I improve the performance of SAMBA over locally-mounted GlusterFS? Right now, I'm getting a 2x to 4x slowdown compared to SAMBA with local disk.

Thanks for any help!
Riccardo

### share definition from /etc/samba/smb.conf

[archive]
    browsable = yes
    writable = yes
    create mask = 0775
    directory mask = 0775

    guest ok = yes

    vfs objects = glusterfs
    kernel share modes = No

    glusterfs:loglevel = 7
    glusterfs:logfile = /var/log/samba/glusterfs-vol-archive.log
    glusterfs:volfile_server = glusterfs-01
    glusterfs:volume = archive
    path = /

### SMBD crash from SAMBA logs

Dec 5 10:20:13 samba-04 smbd[31932]: [2017/12/05 10:20:13.996199, 0] ../source3/modules/vfs_glusterfs.c:258(vfs_gluster_connect)
Dec 5 10:20:13 samba-04 smbd[31932]: archive: Initialized volume from server glusterfs-01
Dec 5 10:20:13 samba-04 smbd[31932]: [2017/12/05 10:20:13.999371, 0] ../lib/util/fault.c:78(fault_report)
Dec 5 10:20:13 samba-04 smbd[31932]: ==============================================================
Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.000042, 0] ../lib/util/fault.c:79(fault_report)
Dec 5 10:20:14 samba-04 smbd[31932]: INTERNAL ERROR: Signal 6 in pid 31932 (4.3.11-Ubuntu)
Dec 5 10:20:14 samba-04 smbd[31932]: Please read the Trouble-Shooting section of the Samba HOWTO
Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.000594, 0] ../lib/util/fault.c:81(fault_report)
Dec 5 10:20:14 samba-04 smbd[31932]: ==============================================================
Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.000954, 0] ../source3/lib/util.c:789(smb_panic_s3)
Dec 5 10:20:14 samba-04 smbd[31932]: PANIC (pid 31932): internal error
Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.001712, 0] ../source3/lib/util.c:900(log_stack_trace)
Dec 5 10:20:14 samba-04 smbd[31932]: BACKTRACE: 29 stack frames:
Dec 5 10:20:14 samba-04 smbd[31932]: #0 /usr/lib/x86_64-linux-gnu/samba/libsmbregistry.so.0(log_stack_trace+0x1a) [0x7f5e8b5427aa]
Dec 5 10:20:14 samba-04 smbd[31932]: #1 /usr/lib/x86_64-linux-gnu/samba/libsmbregistry.so.0(smb_panic_s3+0x20) [0x7f5e8b542880]
Dec 5 10:20:14 samba-04 smbd[31932]: #2 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(smb_panic+0x2f) [0x7f5e8c2b5f1f]
Dec 5 10:20:14 samba-04 smbd[31932]: #3 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(+0x1b136) [0x7f5e8c2b6136]
Dec 5 10:20:14 samba-04 smbd[31932]: #4 /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) [0x7f5e8c514390]
Dec 5 10:20:14 samba-04 smbd[31932]: #5 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7f5e88a87428]
Dec 5 10:20:14 samba-04 smbd[31932]: #6 /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7f5e88a8902a]
Dec 5 10:20:14 samba-04 smbd[31932]: #7 /lib/x86_64-linux-gnu/libc.so.6(+0x777ea) [0x7f5e88ac97ea]
Dec 5 10:20:14 samba-04 smbd[31932]: #8 /lib/x86_64-linux-gnu/libc.so.6(+0x8037a) [0x7f5e88ad237a]
Dec 5 10:20:14 samba-04 smbd[31932]: #9 /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c) [0x7f5e88ad653c]
Dec 5 10:20:14 samba-04 smbd[31932]: #10 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x117955) [0x7f5e8be88955]
Dec 5 10:20:14 samba-04 smbd[31932]: #11 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x1188b7) [0x7f5e8be898b7]
Dec 5 10:20:14 samba-04 smbd[31932]: #12 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(make_connection_smb2+0x62) [0x7f5e8be8a662]
Dec 5 10:20:14 samba-04 smbd[31932]: #13 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_smb2_request_process_tcon+0x6ad) [0x7f5e8be9e48d]
Dec 5 10:20:14 samba-04 smbd[31932]: #14 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_smb2_request_dispatch+0x974) [0x7f5e8be98bd4]
Dec 5 10:20:14 samba-04 smbd[31932]: #15 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x1287fb) [0x7f5e8be997fb]
Dec 5 10:20:14 samba-04 smbd[31932]: #16 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(run_events_poll+0x167) [0x7f5e8a1d5917]
Dec 5 10:20:14 samba-04 smbd[31932]: #17 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(+0x2cb77) [0x7f5e8a1d5b77]
Dec 5 10:20:14 samba-04 smbd[31932]: #18 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x8d) [0x7f5e88e1fd3d]
Dec 5 10:20:14 samba-04 smbd[31932]: #19 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7f5e88e1fedb]
Dec 5 10:20:14 samba-04 smbd[31932]: #20 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(smbd_process+0x718) [0x7f5e8be88578]
Dec 5 10:20:14 samba-04 smbd[31932]: #21 /usr/sbin/smbd(+0x8e12) [0x55ba26c51e12]
Dec 5 10:20:14 samba-04 smbd[31932]: #22 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(run_events_poll+0x167) [0x7f5e8a1d5917]
Dec 5 10:20:14 samba-04 smbd[31932]: #23 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(+0x2cb77) [0x7f5e8a1d5b77]
Dec 5 10:20:14 samba-04 smbd[31932]: #24 /usr/lib/x86_64-linux-gnu/libtevent.so.0(_tevent_loop_once+0x8d) [0x7f5e88e1fd3d]
Dec 5 10:20:14 samba-04 smbd[31932]: #25 /usr/lib/x86_64-linux-gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7f5e88e1fedb]
Dec 5 10:20:14 samba-04 smbd[31932]: #26 /usr/sbin/smbd(main+0x1899) [0x55ba26c50099]
Dec 5 10:20:14 samba-04 smbd[31932]: #27 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f5e88a72830]
Dec 5 10:20:14 samba-04 smbd[31932]: #28 /usr/sbin/smbd(_start+0x29) [0x55ba26c50199]
Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.006804, 0] ../source3/lib/dumpcore.c:303(dump_core)
Dec 5 10:20:14 samba-04 smbd[31932]: dumping core in /var/log/samba/cores/smbd
Dec 5 10:20:14 samba-04 smbd[31932]:
Dec 5 10:20:14 samba-04 smbd[31944]: [2017/12/05 10:20:14.213971, 0] ../source3/smbd/smb2_server.c:547(smbd_smb2_request_create)
Dec 5 10:20:14 samba-04 smbd[31944]: Invalid SMB packet: first request: 0x0003

### GlusterFS client logs (issued from the `vfs_glusterfs.so` module within `smbd`)

[2017-12-05 10:20:13.949861] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-gfapi: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2017-12-05 10:20:13.965822] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 0-archive-disperse-0: Using 'avx' CPU extensions
[2017-12-05 10:20:13.969808] W [MSGID: 101174] [graph.c:363:_log_if_unknown_option] 0-archive-readdir-ahead: option 'parallel-readdir' is not recognized
[2017-12-05 10:20:13.970295] I [MSGID: 104045] [glfs-master.c:91:notify] 0-gfapi: New graph 70656c6b-6d61-6e73-6c61-622d73616d62 (0) coming up
[2017-12-05 10:20:13.970352] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-0: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.970718] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-1: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.971011] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-2: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.971205] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-3: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.971401] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-4: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.971576] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-5: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.971770] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-6: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.971985] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-7: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.972254] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-8: parent translators are ready, attempting connect on transport
[2017-12-05 10:20:13.972495] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-9: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume archive-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host glusterfs-10
  5:     option remote-subvolume /srv/glusterfs/brick/archive
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option transport.tcp-user-timeout 0
  9:     option transport.socket.keepalive-time 20
 10:     option transport.socket.keepalive-interval 2
 11:     option transport.socket.keepalive-count 9
 12:     option send-gids true
 13: end-volume
 14:
 15: volume archive-client-1
 16:     type protocol/client
 17:     option ping-timeout 42
 18:     option remote-host glusterfs-09
 19:     option remote-subvolume /srv/glusterfs/brick/archive
 20:     option transport-type socket
 21:     option transport.address-family inet
 22:     option transport.tcp-user-timeout 0
 23:     option transport.socket.keepalive-time 20
 24:     option transport.socket.keepalive-interval 2
 25:     option transport.socket.keepalive-count 9
 26:     option send-gids true
 27: end-volume
 28:
 28:
 29: volume archive-client-2
 30:     type protocol/client
 31:     option ping-timeout 42
[2017-12-05 10:20:13.972846] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-0: changing port to 49153 (from 0)
 32:     option remote-host glusterfs-08
 33:     option remote-subvolume /srv/glusterfs/brick/archive
 34:     option transport-type socket
 35:     option transport.address-family inet
 36:     option transport.tcp-user-timeout 0
 37:     option transport.socket.keepalive-time 20
 38:     option transport.socket.keepalive-interval 2
 39:     option transport.socket.keepalive-count 9
 40:     option send-gids true
 41: end-volume
 42:
 43: volume archive-client-3
 44:     type protocol/client
 45:     option ping-timeout 42
 46:     option remote-host glusterfs-07
 47:     option remote-subvolume /srv/glusterfs/brick/archive
 48:     option transport-type socket
 49:     option transport.address-family inet
 50:     option transport.tcp-user-timeout 0
[2017-12-05 10:20:13.972929] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-3: changing port to 49153 (from 0)
 51:     option transport.socket.keepalive-time 20
 52:     option transport.socket.keepalive-interval 2
 53:     option transport.socket.keepalive-count 9
 54:     option send-gids true
 55: end-volume
 56:
 57: volume archive-client-4
 58:     type protocol/client
 59:     option ping-timeout 42
 60:     option remote-host glusterfs-06
 61:     option remote-subvolume /srv/glusterfs/brick/archive
 62:     option transport-type socket
 63:     option transport.address-family inet
 64:     option transport.tcp-user-timeout 0
 65:     option transport.socket.keepalive-time 20
 66:     option transport.socket.keepalive-interval 2
 67:     option transport.socket.keepalive-count 9
 68:     option send-gids true
 69: end-volume
 70:
 71: volume archive-client-5
 72:     type protocol/client
 73:     option ping-timeout 42
 74:     option remote-host glusterfs-05
 75:     option remote-subvolume /srv/glusterfs/brick/archive
 76:     option transport-type socket
 77:     option transport.address-family inet
 78:     option transport.tcp-user-timeout 0
 79:     option transport.socket.keepalive-time 20
 80:     option transport.socket.keepalive-interval 2
 81:     option transport.socket.keepalive-count 9
 82:     option send-gids true
 83: end-volume
 84:
 85: volume archive-client-6
 86:     type protocol/client
 87:     option ping-timeout 42
 88:     option remote-host glusterfs-04
 89:     option remote-subvolume /srv/glusterfs/brick/archive
 90:     option transport-type socket
 91:     option transport.address-family inet
 92:     option transport.tcp-user-timeout 0
 93:     option transport.socket.keepalive-time 20
 94:     option transport.socket.keepalive-interval 2
 95:     option transport.socket.keepalive-count 9
 96:     option send-gids true
 97: end-volume
 98:
 99: volume archive-client-7
100:     type protocol/client
101:     option ping-timeout 42
102:     option remote-host glusterfs-03
103:     option remote-subvolume /srv/glusterfs/brick/archive
104:     option transport-type socket
105:     option transport.address-family inet
106:     option transport.tcp-user-timeout 0
107:     option transport.socket.keepalive-time 20
108:     option transport.socket.keepalive-interval 2
109:     option transport.socket.keepalive-count 9
110:     option send-gids true
111: end-volume
112:
113: volume archive-client-8
114:     type protocol/client
115:     option ping-timeout 42
116:     option remote-host glusterfs-02
117:     option remote-subvolume /srv/glusterfs/brick/archive
118:     option transport-type socket
119:     option transport.address-family inet
120:     option transport.tcp-user-timeout 0
121:     option transport.socket.keepalive-time 20
122:     option transport.socket.keepalive-interval 2
123:     option transport.socket.keepalive-count 9
124:     option send-gids true
125: end-volume
126:
127: volume archive-client-9
128:     type protocol/client
129:     option ping-timeout 42
130:     option remote-host glusterfs-01
131:     option remote-subvolume /srv/glusterfs/brick/archive
132:     option transport-type socket
133:     option transport.address-family inet
134:     option transport.tcp-user-timeout 0
135:     option transport.socket.keepalive-time 20
136:     option transport.socket.keepalive-interval 2
137:     option transport.socket.keepalive-count 9
138:     option send-gids true
139: end-volume
140:
141: volume archive-disperse-0
142:     type cluster/disperse
143:     option redundancy 2
144:     subvolumes archive-client-0 archive-client-1 archive-client-2 archive-client-3 archive-client-4 archive-client-5 archive-client-6 archive-client-7 archive-client-8 archive-client-9
145: end-volume
146:
147: volume archive-dht
148:     type cluster/distribute
149:     option lock-migration off
150:     subvolumes archive-disperse-0
151: end-volume
152:
153: volume archive-write-behind
154:     type performance/write-behind
155:     subvolumes archive-dht
156: end-volume
157:
158: volume archive-read-ahead
159:     type performance/read-ahead
160:     subvolumes archive-write-behind
161: end-volume
162:
163: volume archive-readdir-ahead
164:     type performance/readdir-ahead
165:     option parallel-readdir off
166:     option rda-request-size 131072
167:     option rda-cache-limit 10MB
168:     subvolumes archive-read-ahead
169: end-volume
170:
171: volume archive-io-cache
172:     type performance/io-cache
173:     subvolumes archive-readdir-ahead
174: end-volume
175:
176: volume archive-quick-read
177:     type performance/quick-read
178:     subvolumes archive-io-cache
179: end-volume
180:
181: volume archive-open-behind
182:     type performance/open-behind
183:     subvolumes archive-quick-read
184: end-volume
185:
186: volume archive-md-cache
187:     type performance/md-cache
188:     option cache-posix-acl true
189:     subvolumes archive-open-behind
190: end-volume
191:
192: volume archive-io-threads
193:     type performance/io-threads
194:     subvolumes archive-md-cache
195: end-volume
196:
197: volume archive
198:     type debug/io-stats
199:     option log-level INFO
200:     option latency-measurement off
201:     option count-fop-hits off
202:     subvolumes archive-io-threads
203: end-volume
204:
205: volume meta-autoload
206:     type meta
207:     subvolumes archive
208: end-volume
209:
+------------------------------------------------------------------------------+
[2017-12-05 10:20:13.973638] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-4: changing port to 49153 (from 0)
[2017-12-05 10:20:13.973683] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-1: changing port to 49153 (from 0)
[2017-12-05 10:20:13.974177] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-6: changing port to 49153 (from 0)
[2017-12-05 10:20:13.974236] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-2: changing port to 49153 (from 0)
[2017-12-05 10:20:13.974818] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-5: changing port to 49153 (from 0)
[2017-12-05 10:20:13.974862] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.974877] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-7: changing port to 49153 (from 0)
[2017-12-05 10:20:13.974925] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.974936] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-9: changing port to 49153 (from 0)
[2017-12-05 10:20:13.975363] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-8: changing port to 49153 (from 0)
[2017-12-05 10:20:13.975458] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-4: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.975482] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.975980] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.976005] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.976211] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-0: Connected to archive-client-0, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.976244] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.976299] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-3: Connected to archive-client-3, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.976314] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.976492] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.976501] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-7: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.976599] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.976628] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-archive-client-8: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-12-05 10:20:13.976792] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-0: Server lk version = 1
[2017-12-05 10:20:13.976806] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-3: Server lk version = 1
[2017-12-05 10:20:13.976941] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-6: Connected to archive-client-6, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.976953] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-6: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.976972] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-1: Connected to archive-client-1, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.976986] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.977015] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-2: Connected to archive-client-2, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.977025] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.977318] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-6: Server lk version = 1
[2017-12-05 10:20:13.977411] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-2: Server lk version = 1
[2017-12-05 10:20:13.977427] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-1: Server lk version = 1
[2017-12-05 10:20:13.977453] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-8: Connected to archive-client-8, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.977463] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-8: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.977637] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-7: Connected to archive-client-7, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.977647] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-7: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.977815] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-8: Server lk version = 1
[2017-12-05 10:20:13.977993] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-7: Server lk version = 1
[2017-12-05 10:20:13.978005] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-9: Connected to archive-client-9, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.978017] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-9: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.978364] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-9: Server lk version = 1
[2017-12-05 10:20:13.979511] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-5: Connected to archive-client-5, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.979545] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-5: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.980131] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-5: Server lk version = 1
[2017-12-05 10:20:13.990225] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-4: Connected to archive-client-4, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.990267] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-4: Server and Client lk-version numbers are not same, reopening the fds
[2017-12-05 10:20:13.990350] I [MSGID: 122061] [ec.c:344:ec_up] 0-archive-disperse-0: Going UP
[2017-12-05 10:20:13.990942] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-4: Server lk version = 1
[2017-12-05 10:20:13.996048] I [MSGID: 104041] [glfs-resolve.c:971:__glfs_active_subvol] 0-archive: switched to graph 70656c6b-6d61-6e73-6c61-622d73616d62 (0)
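
For readers triaging a similar crash, here is a minimal Python sketch for summarizing the two logs above. It is not part of the original post: the helper names `parse_frames`, `abort_called_from_free`, and `summarize_gluster_log` are hypothetical, and the sample lines are copied from the logs above. The regular expressions assume the exact log layouts shown here and may need adjusting for other Samba or GlusterFS versions.

```python
import re
from collections import Counter

# A frame in the smbd stack trace, e.g.
# "#9 /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c) [0x7f5e88ad653c]"
FRAME_RE = re.compile(r"#(\d+)\s+(\S+?)\(([^)]*)\)\s+\[(0x[0-9a-f]+)\]")
# Header of a GlusterFS client log entry, e.g. "[2017-12-05 10:20:13.990350] I "
GLUSTER_ENTRY_RE = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] ([A-Z]) ")
CONNECTED_RE = re.compile(r"Connected to ([\w-]+-client-\d+)")


def parse_frames(text):
    """Return (frame number, library file name, symbol or None) per stack frame."""
    frames = []
    for num, lib, sym, _addr in FRAME_RE.findall(text):
        name = sym.split("+", 1)[0]  # empty when only an offset was logged
        frames.append((int(num), lib.rsplit("/", 1)[-1], name or None))
    return frames


def abort_called_from_free(frames):
    """True if abort() appears with free()/cfree() among its callers."""
    names = [name for _, _, name in frames]
    if "abort" not in names:
        return False
    callers = names[names.index("abort") + 1:]
    return any(n in ("free", "cfree") for n in callers)


def summarize_gluster_log(text):
    """Return (severity counts, set of client translators that connected)."""
    levels = Counter(GLUSTER_ENTRY_RE.findall(text))
    connected = set(CONNECTED_RE.findall(text))
    return levels, connected


# Sample lines copied from the logs above.
sample_trace = """\
#5 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7f5e88a87428]
#6 /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7f5e88a8902a]
#7 /lib/x86_64-linux-gnu/libc.so.6(+0x777ea) [0x7f5e88ac97ea]
#9 /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c) [0x7f5e88ad653c]
#10 /usr/lib/x86_64-linux-gnu/samba/libsmbd-base.so.0(+0x117955) [0x7f5e8be88955]
"""
sample_log = (
    "[2017-12-05 10:20:13.969808] W [MSGID: 101174] [graph.c:363:_log_if_unknown_option] "
    "0-archive-readdir-ahead: option 'parallel-readdir' is not recognized\n"
    "[2017-12-05 10:20:13.976211] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] "
    "0-archive-client-0: Connected to archive-client-0, attached to remote volume "
    "'/srv/glusterfs/brick/archive'.\n"
    "[2017-12-05 10:20:13.990350] I [MSGID: 122061] [ec.c:344:ec_up] "
    "0-archive-disperse-0: Going UP\n"
)

frames = parse_frames(sample_trace)
print(abort_called_from_free(frames))
levels, connected = summarize_gluster_log(sample_log)
print(dict(levels), sorted(connected))
```

In the backtrace above, `abort` (frame #6) is reached from `cfree` (frame #9), which in turn is called from `libsmbd-base.so.0` during `make_connection_smb2`. glibc aborting from inside `free()` usually points to an invalid or corrupted pointer being freed, though that alone does not identify the faulty caller.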
Keep in mind that a local disk link runs at 3, 6, or 12 Gbps, whereas a network connection is typically 1 Gbps. A local four-disk RAID 10 array will outperform even 10G Ethernet (especially with SAS drives).
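
To put the link speeds mentioned in this reply side by side, here is a small Python sketch that converts raw line rates to MB/s. The helper name `gbps_to_mb_per_s` is hypothetical, and the figures are raw link ceilings only: line encoding, protocol overhead, and the sustained throughput of the drives themselves are all ignored, so real numbers will be lower.

```python
def gbps_to_mb_per_s(gbps):
    """Convert a raw line rate in Gbit/s to MB/s (1 MB = 10**6 bytes)."""
    return gbps * 1000 / 8


# Link speeds named in the reply; the labels are only for display.
links = [
    ("1G Ethernet", 1),
    ("10G Ethernet", 10),
    ("disk link, 3 Gbps", 3),
    ("disk link, 6 Gbps", 6),
    ("disk link, 12 Gbps", 12),
]

for label, rate in links:
    print(f"{label:20s} {gbps_to_mb_per_s(rate):7.1f} MB/s")
```

By this rough measure a single 1 Gbps network link tops out at 125 MB/s, so a Samba server backed by a network filesystem cannot match a local disk beyond that point regardless of tuning.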
transport.tcp-user-timeout 0 >[2017-12-05 10:20:13.972929] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-3: changing port to 49153 (from 0) > 51: option transport.socket.keepalive-time 20 > 52: option transport.socket.keepalive-interval 2 > 53: option transport.socket.keepalive-count 9 > 54: option send-gids true > 55: end-volume > 56: > 57: volume archive-client-4 > 58: type protocol/client > 59: option ping-timeout 42 > 60: option remote-host glusterfs-06 > 61: option remote-subvolume /srv/glusterfs/brick/archive > 62: option transport-type socket > 63: option transport.address-family inet > 64: option transport.tcp-user-timeout 0 > 65: option transport.socket.keepalive-time 20 > 66: option transport.socket.keepalive-interval 2 > 67: option transport.socket.keepalive-count 9 > 68: option send-gids true > 69: end-volume > 70: > 71: volume archive-client-5 > 72: type protocol/client > 73: option ping-timeout 42 > 74: option remote-host glusterfs-05 > 75: option remote-subvolume /srv/glusterfs/brick/archive > 76: option transport-type socket > 77: option transport.address-family inet > 78: option transport.tcp-user-timeout 0 > 79: option transport.socket.keepalive-time 20 > 80: option transport.socket.keepalive-interval 2 > 81: option transport.socket.keepalive-count 9 > 82: option send-gids true > 83: end-volume > 84: > 85: volume archive-client-6 > 86: type protocol/client > 87: option ping-timeout 42 > 88: option remote-host glusterfs-04 > 89: option remote-subvolume /srv/glusterfs/brick/archive > 90: option transport-type socket > 91: option transport.address-family inet > 92: option transport.tcp-user-timeout 0 > 93: option transport.socket.keepalive-time 20 > 94: option transport.socket.keepalive-interval 2 > 95: option transport.socket.keepalive-count 9 > 96: option send-gids true > 97: end-volume > 98: > 99: volume archive-client-7 >100: type protocol/client >101: option ping-timeout 42 >102: option remote-host glusterfs-03 >103: option 
remote-subvolume /srv/glusterfs/brick/archive >104: option transport-type socket >105: option transport.address-family inet >106: option transport.tcp-user-timeout 0 >107: option transport.socket.keepalive-time 20 >108: option transport.socket.keepalive-interval 2 >109: option transport.socket.keepalive-count 9 >110: option send-gids true >111: end-volume >112: >113: volume archive-client-8 >114: type protocol/client >115: option ping-timeout 42 >116: option remote-host glusterfs-02 >117: option remote-subvolume /srv/glusterfs/brick/archive >118: option transport-type socket >119: option transport.address-family inet >120: option transport.tcp-user-timeout 0 >121: option transport.socket.keepalive-time 20 >122: option transport.socket.keepalive-interval 2 >123: option transport.socket.keepalive-count 9 >124: option send-gids true >125: end-volume >126: >127: volume archive-client-9 >128: type protocol/client >129: option ping-timeout 42 >130: option remote-host glusterfs-01 >131: option remote-subvolume /srv/glusterfs/brick/archive >132: option transport-type socket >133: option transport.address-family inet >134: option transport.tcp-user-timeout 0 >135: option transport.socket.keepalive-time 20 >136: option transport.socket.keepalive-interval 2 >137: option transport.socket.keepalive-count 9 >138: option send-gids true >139: end-volume >140: >141: volume archive-disperse-0 >142: type cluster/disperse >143: option redundancy 2 >144: subvolumes archive-client-0 archive-client-1 archive-client-2 >archive-client-3 archive-client-4 archive-client-5 archive-client-6 >archive-client-7 archive-client-8 archive-client-9 >145: end-volume >146: >147: volume archive-dht >148: type cluster/distribute >149: option lock-migration off >150: subvolumes archive-disperse-0 >151: end-volume >152: >153: volume archive-write-behind >154: type performance/write-behind >155: subvolumes archive-dht >156: end-volume >157: >158: volume archive-read-ahead >159: type performance/read-ahead 
>160: subvolumes archive-write-behind >161: end-volume >162: >163: volume archive-readdir-ahead >164: type performance/readdir-ahead >165: option parallel-readdir off >166: option rda-request-size 131072 >167: option rda-cache-limit 10MB >168: subvolumes archive-read-ahead >169: end-volume >170: >171: volume archive-io-cache >172: type performance/io-cache >173: subvolumes archive-readdir-ahead >174: end-volume >175: >176: volume archive-quick-read >177: type performance/quick-read >178: subvolumes archive-io-cache >179: end-volume >180: >181: volume archive-open-behind >182: type performance/open-behind >183: subvolumes archive-quick-read >184: end-volume >185: >186: volume archive-md-cache >187: type performance/md-cache >188: option cache-posix-acl true >189: subvolumes archive-open-behind >190: end-volume >191: >192: volume archive-io-threads >193: type performance/io-threads >194: subvolumes archive-md-cache >195: end-volume >196: >197: volume archive >198: type debug/io-stats >199: option log-level INFO >200: option latency-measurement off >201: option count-fop-hits off >202: subvolumes archive-io-threads >203: end-volume >204: >205: volume meta-autoload >206: type meta >207: subvolumes archive >208: end-volume >209: >+------------------------------------------------------------------------------+ >[2017-12-05 10:20:13.973638] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-4: changing port to 49153 (from 0) >[2017-12-05 10:20:13.973683] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-1: changing port to 49153 (from 0) >[2017-12-05 10:20:13.974177] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-6: changing port to 49153 (from 0) >[2017-12-05 10:20:13.974236] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-2: changing port to 49153 (from 0) >[2017-12-05 10:20:13.974818] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-5: changing port to 49153 (from 0) >[2017-12-05 10:20:13.974862] I [MSGID: 114057] 
>[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-0: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.974877] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-7: changing port to 49153 (from 0) >[2017-12-05 10:20:13.974925] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-3: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.974936] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-9: changing port to 49153 (from 0) >[2017-12-05 10:20:13.975363] I [rpc-clnt.c:1986:rpc_clnt_reconfig] >0-archive-client-8: changing port to 49153 (from 0) >[2017-12-05 10:20:13.975458] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-4: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.975482] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-1: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.975980] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-6: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.976005] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-2: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.976211] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-0: >Connected to archive-client-0, attached to remote volume >'/srv/glusterfs/brick/archive'. 
>[2017-12-05 10:20:13.976244] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-0: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.976299] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-3: >Connected to archive-client-3, attached to remote volume >'/srv/glusterfs/brick/archive'. >[2017-12-05 10:20:13.976314] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-3: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.976492] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-5: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.976501] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-7: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.976599] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-9: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.976628] I [MSGID: 114057] >[client-handshake.c:1478:select_server_supported_programs] >0-archive-client-8: Using Program GlusterFS 3.3, Num (1298437), Version >(330) >[2017-12-05 10:20:13.976792] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-0: >Server lk version = 1 >[2017-12-05 10:20:13.976806] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-3: >Server lk version = 1 >[2017-12-05 10:20:13.976941] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-6: >Connected to archive-client-6, attached to remote volume >'/srv/glusterfs/brick/archive'. 
>[2017-12-05 10:20:13.976953] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-6: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.976972] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-1: >Connected to archive-client-1, attached to remote volume >'/srv/glusterfs/brick/archive'. >[2017-12-05 10:20:13.976986] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-1: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.977015] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-2: >Connected to archive-client-2, attached to remote volume >'/srv/glusterfs/brick/archive'. 
>[2017-12-05 10:20:13.977025] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-2: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.977318] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-6: >Server lk version = 1 >[2017-12-05 10:20:13.977411] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-2: >Server lk version = 1 >[2017-12-05 10:20:13.977427] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-1: >Server lk version = 1 >[2017-12-05 10:20:13.977453] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-8: >Connected to archive-client-8, attached to remote volume >'/srv/glusterfs/brick/archive'. >[2017-12-05 10:20:13.977463] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-8: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.977637] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-7: >Connected to archive-client-7, attached to remote volume >'/srv/glusterfs/brick/archive'. >[2017-12-05 10:20:13.977647] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-7: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.977815] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-8: >Server lk version = 1 >[2017-12-05 10:20:13.977993] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-7: >Server lk version = 1 >[2017-12-05 10:20:13.978005] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-9: >Connected to archive-client-9, attached to remote volume >'/srv/glusterfs/brick/archive'. 
>[2017-12-05 10:20:13.978017] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-9: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.978364] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-9: >Server lk version = 1 >[2017-12-05 10:20:13.979511] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-5: >Connected to archive-client-5, attached to remote volume >'/srv/glusterfs/brick/archive'. >[2017-12-05 10:20:13.979545] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-5: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.980131] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-5: >Server lk version = 1 >[2017-12-05 10:20:13.990225] I [MSGID: 114046] >[client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-4: >Connected to archive-client-4, attached to remote volume >'/srv/glusterfs/brick/archive'. >[2017-12-05 10:20:13.990267] I [MSGID: 114047] >[client-handshake.c:1242:client_setvolume_cbk] 0-archive-client-4: >Server and Client lk-version numbers are not same, reopening the fds >[2017-12-05 10:20:13.990350] I [MSGID: 122061] [ec.c:344:ec_up] >0-archive-disperse-0: Going UP >[2017-12-05 10:20:13.990942] I [MSGID: 114035] >[client-handshake.c:202:client_set_lk_version_cbk] 0-archive-client-4: >Server lk version = 1 >[2017-12-05 10:20:13.996048] I [MSGID: 104041] >[glfs-resolve.c:971:__glfs_active_subvol] 0-archive: switched to graph >70656c6b-6d61-6e73-6c61-622d73616d62 (0)
Riccardo Murri
2017-Dec-05 15:53 UTC
[Gluster-users] SAMBA VFS module for GlusterFS crashes
Hi John, thanks for your remark. However:

2017-12-05 16:47 GMT+01:00 Jim Kinney <jim.kinney at gmail.com>:
> Keep in mind a local disk is 3,6,12 Gbps but a network connection is
> typically 1Gbps. A local disk quad in raid 10 will outperform a 10G ethernet
> (especially using SAS drives).

Well, in this case all servers (both GlusterFS and SAMBA) are running on VMs whose storage is in a Ceph cluster -- so "local disk" really means an RBD volume over 10Gb/s ethernet... (I know that it makes little sense to run GlusterFS over Ceph storage, but that's what we have and I cannot serve RBD volumes from Ceph directly unless they're attached to some VM...)

Ciao,
R
On Tue, 2017-12-05 at 11:11 +0000, Riccardo Murri wrote:> Hello, > > I'm trying to set up a SAMBA server serving a GlusterFS volume. > Everything works fine if I locally mount the GlusterFS volume (`mount > -t glusterfs ...`) and then serve the mounted FS through SAMBA, but > the performance is slower by a 2x/3x compared to a SAMBA server with a > local ext4 filesystem. > > I gather that SAMBA vfs_glusterfs module can give better > performance. However, as soon as I switch SMBD to using GFAPI (`vfs > objects = glusterfs`), the SMBD process crashes leaving a backtrace in > the logs. I can see no errors in the associated GlusterFS log file > and everything runs fine when locally-mounting. I am copying below > the share definition, SMBD crash dump from the logs, and the GlusterFS > client logs (produced by the `vfs_glusterfs.so` module in SMBD). > > I'm using Ubuntu 16.04.3 with samba package > 2:4.3.11+dfsg-0ubuntu0.16.04.12 (I locally recompiled > `samba-vfs-modules` to enable the GlusterFS VFS which is by default > disabled in Ubuntu builds). > > Any suggestions on what/where to look? > > Alternatively: how to improve the performance of SAMBA over > locally-mounted GlusterFS? right now, I'm getting a 2x to 4x slowdown > compared to SAMBA with local disk. > > Thanks for any help! 
> > Riccardo > > > ### share definition from /etc/samba/smb.conf > > [archive] > browsable =yes > writable = yes > create mask = 0775 > directory mask = 0775 > > guest ok = yes > > vfs objects = glusterfs > kernel share modes = No > > glusterfs:loglevel = 7 > glusterfs:logfile = /var/log/samba/glusterfs-vol-archive.log > glusterfs:volfile_server = glusterfs-01 > glusterfs:volume = archive > path = / > > > ### SMBD crash from SAMBA logs > > Dec 5 10:20:13 samba-04 smbd[31932]: [2017/12/05 10:20:13.996199, 0] > ../source3/modules/vfs_glusterfs.c:258(vfs_gluster_connect) > Dec 5 10:20:13 samba-04 smbd[31932]: archive: Initialized volume from server glusterfs-01 > Dec 5 10:20:13 samba-04 smbd[31932]: [2017/12/05 10:20:13.999371, 0] > ../lib/util/fault.c:78(fault_report) > Dec 5 10:20:13 samba-04 > smbd[31932]: ============================================================== > Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.000042, 0] > ../lib/util/fault.c:79(fault_report) > Dec 5 10:20:14 samba-04 smbd[31932]: INTERNAL ERROR: Signal 6 in pid 31932 (4.3.11-Ubuntu) > Dec 5 10:20:14 samba-04 smbd[31932]: Please read the Trouble-Shooting section of the Samba > HOWTO > Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.000594, 0] > ../lib/util/fault.c:81(fault_report) > Dec 5 10:20:14 samba-04 > smbd[31932]: ============================================================== > Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.000954, 0] > ../source3/lib/util.c:789(smb_panic_s3) > Dec 5 10:20:14 samba-04 smbd[31932]: PANIC (pid 31932): internal error > Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.001712, 0] > ../source3/lib/util.c:900(log_stack_trace) > Dec 5 10:20:14 samba-04 smbd[31932]: BACKTRACE: 29 stack frames: > Dec 5 10:20:14 samba-04 smbd[31932]: #0 /usr/lib/x86_64-linux- > gnu/samba/libsmbregistry.so.0(log_stack_trace+0x1a) [0x7f5e8b5427aa] > Dec 5 10:20:14 samba-04 smbd[31932]: #1 /usr/lib/x86_64-linux- > 
gnu/samba/libsmbregistry.so.0(smb_panic_s3+0x20) [0x7f5e8b542880] > Dec 5 10:20:14 samba-04 smbd[31932]: #2 /usr/lib/x86_64-linux-gnu/libsamba- > util.so.0(smb_panic+0x2f) [0x7f5e8c2b5f1f] > Dec 5 10:20:14 samba-04 smbd[31932]: #3 /usr/lib/x86_64-linux-gnu/libsamba-util.so.0(+0x1b136) > [0x7f5e8c2b6136] > Dec 5 10:20:14 samba-04 smbd[31932]: #4 /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) > [0x7f5e8c514390] > Dec 5 10:20:14 samba-04 smbd[31932]: #5 /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) > [0x7f5e88a87428] > Dec 5 10:20:14 samba-04 smbd[31932]: #6 /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) > [0x7f5e88a8902a] > Dec 5 10:20:14 samba-04 smbd[31932]: #7 /lib/x86_64-linux-gnu/libc.so.6(+0x777ea) > [0x7f5e88ac97ea] > Dec 5 10:20:14 samba-04 smbd[31932]: #8 /lib/x86_64-linux-gnu/libc.so.6(+0x8037a) > [0x7f5e88ad237a] > Dec 5 10:20:14 samba-04 smbd[31932]: #9 /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c) > [0x7f5e88ad653c]

I think you are hitting a bug [1] in Samba's VFS module for GlusterFS, triggered during a realpath() call. The regression appeared when glfs_realpath() was modified in GlusterFS [2] to correctly handle the allocation and freeing of its string arguments. That change is present from version 3.7.17 onwards in the 3.7 series, and in the 3.9 series and all later versions (but not in the 3.8 series).

So if you are using one of those GlusterFS versions, you should install a Samba version that contains the fix for this crash: >= 4.4.9 in the 4.4 series, > 4.5.2, or any version from a later series such as 4.6 or 4.7. This was mentioned in the release notes for GlusterFS v3.7.17 [3].

I suggest upgrading both Samba and GlusterFS to supported versions as per [4] and [5].
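The version ranges in this reply can be written down as a small check. This is only a sketch of the rule as stated here, not an official compatibility matrix: the function names are made up for illustration, ">4.5.2" is read literally as strictly greater, and versions must be passed as plain dotted numbers (e.g. `4.3.11`, without a distro suffix). The GlusterFS version `3.12.3` used in the usage note is a hypothetical value, since the thread does not state which version is installed.

```python
def parse_version(text):
    """Turn a plain dotted version such as '4.3.11' into (4, 3, 11)."""
    return tuple(int(part) for part in text.split("."))


def gluster_has_realpath_change(gluster):
    """True if this GlusterFS version carries the glfs_realpath() change,
    following the ranges given in the reply (3.7.17+ in 3.7, 3.9 and later,
    but not the 3.8 series)."""
    version = parse_version(gluster)
    series = version[:2]
    if series == (3, 7):
        return version >= (3, 7, 17)
    if series == (3, 8):
        return False
    return version >= (3, 9)


def samba_has_fix(samba):
    """True if this Samba version contains the fix, following the reply
    (>= 4.4.9 in 4.4, > 4.5.2 in 4.5, or any later series)."""
    version = parse_version(samba)
    series = version[:2]
    if series == (4, 4):
        return version >= (4, 4, 9)
    if series == (4, 5):
        return version > (4, 5, 2)
    return series >= (4, 6)


def combination_may_crash(gluster, samba):
    """True if GlusterFS has the changed glfs_realpath() but Samba lacks the fix."""
    return gluster_has_realpath_change(gluster) and not samba_has_fix(samba)
```

For example, `combination_may_crash("3.12.3", "4.3.11")` evaluates to `True`: Samba 4.3.11 (the version in the crash log) is older than every fixed release listed above.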
[1] https://bugzilla.samba.org/show_bug.cgi?id=12404
[2] https://review.gluster.org/#/c/15332/
[3] https://github.com/gluster/glusterfs/blob/release-3.7/doc/release-notes/3.7.17.md
[4] https://wiki.samba.org/index.php/Samba_Release_Planning
[5] https://www.gluster.org/release-schedule/

> Dec 5 10:20:14 samba-04 smbd[31932]: #10 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(+0x117955) [0x7f5e8be88955] > Dec 5 10:20:14 samba-04 smbd[31932]: #11 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(+0x1188b7) [0x7f5e8be898b7] > Dec 5 10:20:14 samba-04 smbd[31932]: #12 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(make_connection_smb2+0x62) [0x7f5e8be8a662] > Dec 5 10:20:14 samba-04 smbd[31932]: #13 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(smbd_smb2_request_process_tcon+0x6ad) [0x7f5e8be9e48d] > Dec 5 10:20:14 samba-04 smbd[31932]: #14 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(smbd_smb2_request_dispatch+0x974) [0x7f5e8be98bd4] > Dec 5 10:20:14 samba-04 smbd[31932]: #15 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(+0x1287fb) [0x7f5e8be997fb] > Dec 5 10:20:14 samba-04 smbd[31932]: #16 /usr/lib/x86_64-linux- > gnu/libsmbconf.so.0(run_events_poll+0x167) [0x7f5e8a1d5917] > Dec 5 10:20:14 samba-04 smbd[31932]: #17 /usr/lib/x86_64-linux-gnu/libsmbconf.so.0(+0x2cb77) > [0x7f5e8a1d5b77] > Dec 5 10:20:14 samba-04 smbd[31932]: #18 /usr/lib/x86_64-linux- > gnu/libtevent.so.0(_tevent_loop_once+0x8d) [0x7f5e88e1fd3d] > Dec 5 10:20:14 samba-04 smbd[31932]: #19 /usr/lib/x86_64-linux- > gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7f5e88e1fedb] > Dec 5 10:20:14 samba-04 smbd[31932]: #20 /usr/lib/x86_64-linux-gnu/samba/libsmbd- > base.so.0(smbd_process+0x718) [0x7f5e8be88578] > Dec 5 10:20:14 samba-04 smbd[31932]: #21 /usr/sbin/smbd(+0x8e12) [0x55ba26c51e12] > Dec 5 10:20:14 samba-04 smbd[31932]: #22 /usr/lib/x86_64-linux- > gnu/libsmbconf.so.0(run_events_poll+0x167) [0x7f5e8a1d5917] > Dec 5 10:20:14 samba-04 smbd[31932]: #23 
/usr/lib/x86_64-linux-gnu/libsmbconf.so.0(+0x2cb77) > [0x7f5e8a1d5b77] > Dec 5 10:20:14 samba-04 smbd[31932]: #24 /usr/lib/x86_64-linux- > gnu/libtevent.so.0(_tevent_loop_once+0x8d) [0x7f5e88e1fd3d] > Dec 5 10:20:14 samba-04 smbd[31932]: #25 /usr/lib/x86_64-linux- > gnu/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7f5e88e1fedb] > Dec 5 10:20:14 samba-04 smbd[31932]: #26 /usr/sbin/smbd(main+0x1899) [0x55ba26c50099] > Dec 5 10:20:14 samba-04 smbd[31932]: #27 /lib/x86_64-linux- > gnu/libc.so.6(__libc_start_main+0xf0) [0x7f5e88a72830] > Dec 5 10:20:14 samba-04 smbd[31932]: #28 /usr/sbin/smbd(_start+0x29) [0x55ba26c50199] > Dec 5 10:20:14 samba-04 smbd[31932]: [2017/12/05 10:20:14.006804, 0] > ../source3/lib/dumpcore.c:303(dump_core) > Dec 5 10:20:14 samba-04 smbd[31932]: dumping core in /var/log/samba/cores/smbd > Dec 5 10:20:14 samba-04 smbd[31932]: > Dec 5 10:20:14 samba-04 smbd[31944]: [2017/12/05 10:20:14.213971, 0] > ../source3/smbd/smb2_server.c:547(smbd_smb2_request_create) > Dec 5 10:20:14 samba-04 smbd[31944]: Invalid SMB packet: first request: 0x0003 > > ### GlusterFS client logs (issued from the `vfs_gluster.so` module within `smbd`) > > [2017-12-05 10:20:13.949861] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-gfapi: option > 'address-family' is deprecated, preferred is 'transport.address-family', continuing with > correction > [2017-12-05 10:20:13.965822] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 0-archive-disperse- > 0: Using 'avx' CPU extensions > [2017-12-05 10:20:13.969808] W [MSGID: 101174] [graph.c:363:_log_if_unknown_option] 0-archive- > readdir-ahead: option 'parallel-readdir' is not recognized > [2017-12-05 10:20:13.970295] I [MSGID: 104045] [glfs-master.c:91:notify] 0-gfapi: New graph > 70656c6b-6d61-6e73-6c61-622d73616d62 (0) coming up > [2017-12-05 10:20:13.970352] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-0: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.970718] I [MSGID: 
114020] [client.c:2360:notify] 0-archive-client-1: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.971011] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-2: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.971205] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-3: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.971401] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-4: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.971576] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-5: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.971770] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-6: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.971985] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-7: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.972254] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-8: parent > translators are ready, attempting connect on transport > [2017-12-05 10:20:13.972495] I [MSGID: 114020] [client.c:2360:notify] 0-archive-client-9: parent > translators are ready, attempting connect on transport > Final graph: > +------------------------------------------------------------------------------+ > 1: volume archive-client-0 > 2: type protocol/client > 3: option ping-timeout 42 > 4: option remote-host glusterfs-10 > 5: option remote-subvolume /srv/glusterfs/brick/archive > 6: option transport-type socket > 7: option transport.address-family inet > 8: option transport.tcp-user-timeout 0 > 9: option transport.socket.keepalive-time 20 > 10: option transport.socket.keepalive-interval 2 > 11: option transport.socket.keepalive-count 9 > 12: option send-gids true > 13: end-volume > 14: > 15: volume archive-client-1 > 16: type 
protocol/client > 17: option ping-timeout 42 > 18: option remote-host glusterfs-09 > 19: option remote-subvolume /srv/glusterfs/brick/archive > 20: option transport-type socket > 21: option transport.address-family inet > 22: option transport.tcp-user-timeout 0 > 23: option transport.socket.keepalive-time 20 > 24: option transport.socket.keepalive-interval 2 > 25: option transport.socket.keepalive-count 9 > 26: option send-gids true > 27: end-volume > 28: > 28: > 29: volume archive-client-2 > 30: type protocol/client > 31: option ping-timeout 42 > [2017-12-05 10:20:13.972846] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-0: changing > port to 49153 (from 0) > 32: option remote-host glusterfs-08 > 33: option remote-subvolume /srv/glusterfs/brick/archive > 34: option transport-type socket > 35: option transport.address-family inet > 36: option transport.tcp-user-timeout 0 > 37: option transport.socket.keepalive-time 20 > 38: option transport.socket.keepalive-interval 2 > 39: option transport.socket.keepalive-count 9 > 40: option send-gids true > 41: end-volume > 42: > 43: volume archive-client-3 > 44: type protocol/client > 45: option ping-timeout 42 > 46: option remote-host glusterfs-07 > 47: option remote-subvolume /srv/glusterfs/brick/archive > 48: option transport-type socket > 49: option transport.address-family inet > 50: option transport.tcp-user-timeout 0 > [2017-12-05 10:20:13.972929] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-3: changing > port to 49153 (from 0) > 51: option transport.socket.keepalive-time 20 > 52: option transport.socket.keepalive-interval 2 > 53: option transport.socket.keepalive-count 9 > 54: option send-gids true > 55: end-volume > 56: > 57: volume archive-client-4 > 58: type protocol/client > 59: option ping-timeout 42 > 60: option remote-host glusterfs-06 > 61: option remote-subvolume /srv/glusterfs/brick/archive > 62: option transport-type socket > 63: option transport.address-family inet > 64: option 
transport.tcp-user-timeout 0 > 65: option transport.socket.keepalive-time 20 > 66: option transport.socket.keepalive-interval 2 > 67: option transport.socket.keepalive-count 9 > 68: option send-gids true > 69: end-volume > 70: > 71: volume archive-client-5 > 72: type protocol/client > 73: option ping-timeout 42 > 74: option remote-host glusterfs-05 > 75: option remote-subvolume /srv/glusterfs/brick/archive > 76: option transport-type socket > 77: option transport.address-family inet > 78: option transport.tcp-user-timeout 0 > 79: option transport.socket.keepalive-time 20 > 80: option transport.socket.keepalive-interval 2 > 81: option transport.socket.keepalive-count 9 > 82: option send-gids true > 83: end-volume > 84: > 85: volume archive-client-6 > 86: type protocol/client > 87: option ping-timeout 42 > 88: option remote-host glusterfs-04 > 89: option remote-subvolume /srv/glusterfs/brick/archive > 90: option transport-type socket > 91: option transport.address-family inet > 92: option transport.tcp-user-timeout 0 > 93: option transport.socket.keepalive-time 20 > 94: option transport.socket.keepalive-interval 2 > 95: option transport.socket.keepalive-count 9 > 96: option send-gids true > 97: end-volume > 98: > 99: volume archive-client-7 > 100: type protocol/client > 101: option ping-timeout 42 > 102: option remote-host glusterfs-03 > 103: option remote-subvolume /srv/glusterfs/brick/archive > 104: option transport-type socket > 105: option transport.address-family inet > 106: option transport.tcp-user-timeout 0 > 107: option transport.socket.keepalive-time 20 > 108: option transport.socket.keepalive-interval 2 > 109: option transport.socket.keepalive-count 9 > 110: option send-gids true > 111: end-volume > 112: > 113: volume archive-client-8 > 114: type protocol/client > 115: option ping-timeout 42 > 116: option remote-host glusterfs-02 > 117: option remote-subvolume /srv/glusterfs/brick/archive > 118: option transport-type socket > 119: option 
transport.address-family inet > 120: option transport.tcp-user-timeout 0 > 121: option transport.socket.keepalive-time 20 > 122: option transport.socket.keepalive-interval 2 > 123: option transport.socket.keepalive-count 9 > 124: option send-gids true > 125: end-volume > 126: > 127: volume archive-client-9 > 128: type protocol/client > 129: option ping-timeout 42 > 130: option remote-host glusterfs-01 > 131: option remote-subvolume /srv/glusterfs/brick/archive > 132: option transport-type socket > 133: option transport.address-family inet > 134: option transport.tcp-user-timeout 0 > 135: option transport.socket.keepalive-time 20 > 136: option transport.socket.keepalive-interval 2 > 137: option transport.socket.keepalive-count 9 > 138: option send-gids true > 139: end-volume > 140: > 141: volume archive-disperse-0 > 142: type cluster/disperse > 143: option redundancy 2 > 144: subvolumes archive-client-0 archive-client-1 archive-client-2 archive-client-3 archive- > client-4 archive-client-5 archive-client-6 archive-client-7 archive-client-8 archive-client-9 > 145: end-volume > 146: > 147: volume archive-dht > 148: type cluster/distribute > 149: option lock-migration off > 150: subvolumes archive-disperse-0 > 151: end-volume > 152: > 153: volume archive-write-behind > 154: type performance/write-behind > 155: subvolumes archive-dht > 156: end-volume > 157: > 158: volume archive-read-ahead > 159: type performance/read-ahead > 160: subvolumes archive-write-behind > 161: end-volume > 162: > 163: volume archive-readdir-ahead > 164: type performance/readdir-ahead > 165: option parallel-readdir off > 166: option rda-request-size 131072 > 167: option rda-cache-limit 10MB > 168: subvolumes archive-read-ahead > 169: end-volume > 170: > 171: volume archive-io-cache > 172: type performance/io-cache > 173: subvolumes archive-readdir-ahead > 174: end-volume > 175: > 176: volume archive-quick-read > 177: type performance/quick-read > 178: subvolumes archive-io-cache > 179: 
end-volume > 180: > 181: volume archive-open-behind > 182: type performance/open-behind > 183: subvolumes archive-quick-read > 184: end-volume > 185: > 186: volume archive-md-cache > 187: type performance/md-cache > 188: option cache-posix-acl true > 189: subvolumes archive-open-behind > 190: end-volume > 191: > 192: volume archive-io-threads > 193: type performance/io-threads > 194: subvolumes archive-md-cache > 195: end-volume > 196: > 197: volume archive > 198: type debug/io-stats > 199: option log-level INFO > 200: option latency-measurement off > 201: option count-fop-hits off > 202: subvolumes archive-io-threads > 203: end-volume > 204: > 205: volume meta-autoload > 206: type meta > 207: subvolumes archive > 208: end-volume > 209: > +------------------------------------------------------------------------------+ > [2017-12-05 10:20:13.973638] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-4: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.973683] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-1: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.974177] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-6: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.974236] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-2: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.974818] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-5: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.974862] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-0: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.974877] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-7: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.974925] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-3: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.974936] I 
[rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-9: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.975363] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-archive-client-8: changing > port to 49153 (from 0) > [2017-12-05 10:20:13.975458] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-4: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.975482] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-1: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.975980] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-6: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.976005] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-2: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.976211] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-0: Connected to archive-client-0, attached to remote volume > '/srv/glusterfs/brick/archive'. > [2017-12-05 10:20:13.976244] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-0: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.976299] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-3: Connected to archive-client-3, attached to remote volume > '/srv/glusterfs/brick/archive'. 
> [2017-12-05 10:20:13.976314] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-3: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.976492] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-5: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.976501] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-7: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.976599] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-9: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.976628] I [MSGID: 114057] [client- > handshake.c:1478:select_server_supported_programs] 0-archive-client-8: Using Program GlusterFS > 3.3, Num (1298437), Version (330) > [2017-12-05 10:20:13.976792] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-0: Server lk version = 1 > [2017-12-05 10:20:13.976806] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-3: Server lk version = 1 > [2017-12-05 10:20:13.976941] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-6: Connected to archive-client-6, attached to remote volume > '/srv/glusterfs/brick/archive'. > [2017-12-05 10:20:13.976953] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-6: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.976972] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-1: Connected to archive-client-1, attached to remote volume > '/srv/glusterfs/brick/archive'. 
> [2017-12-05 10:20:13.976986] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-1: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.977015] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-2: Connected to archive-client-2, attached to remote volume > '/srv/glusterfs/brick/archive'. > [2017-12-05 10:20:13.977025] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-2: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.977318] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-6: Server lk version = 1 > [2017-12-05 10:20:13.977411] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-2: Server lk version = 1 > [2017-12-05 10:20:13.977427] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-1: Server lk version = 1 > [2017-12-05 10:20:13.977453] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-8: Connected to archive-client-8, attached to remote volume > '/srv/glusterfs/brick/archive'. 
> [2017-12-05 10:20:13.977463] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-8: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.977637] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-7: Connected to archive-client-7, attached to remote volume > '/srv/glusterfs/brick/archive'. > [2017-12-05 10:20:13.977647] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-7: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.977815] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-8: Server lk version = 1 > [2017-12-05 10:20:13.977993] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-7: Server lk version = 1 > [2017-12-05 10:20:13.978005] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-9: Connected to archive-client-9, attached to remote volume > '/srv/glusterfs/brick/archive'. > [2017-12-05 10:20:13.978017] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-9: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.978364] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-9: Server lk version = 1 > [2017-12-05 10:20:13.979511] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-5: Connected to archive-client-5, attached to remote volume > '/srv/glusterfs/brick/archive'. 
> [2017-12-05 10:20:13.979545] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-5: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.980131] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-5: Server lk version = 1 > [2017-12-05 10:20:13.990225] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0- > archive-client-4: Connected to archive-client-4, attached to remote volume > '/srv/glusterfs/brick/archive'. > [2017-12-05 10:20:13.990267] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0- > archive-client-4: Server and Client lk-version numbers are not same, reopening the fds > [2017-12-05 10:20:13.990350] I [MSGID: 122061] [ec.c:344:ec_up] 0-archive-disperse-0: Going UP > [2017-12-05 10:20:13.990942] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] > 0-archive-client-4: Server lk version = 1 > [2017-12-05 10:20:13.996048] I [MSGID: 104041] [glfs-resolve.c:971:__glfs_active_subvol] 0- > archive: switched to graph 70656c6b-6d61-6e73-6c61-622d73616d62 (0)
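A quick way to confirm from a client log like the one quoted above that every brick client connected before the crash, sketched in Python. It assumes the raw log file has one entry per line (the quote above is re-wrapped by the mail client) and matches only the MSGID 114046 "Connected to" message as it appears in this log. The function name `connected_clients` and the two-line `sample` are illustrative, not part of GlusterFS or Samba.

```python
import re

# Matches the "Connected to <name>," message (MSGID 114046) seen in this log.
CONNECTED = re.compile(r"MSGID: 114046\].*?Connected to (\S+?),")

def connected_clients(log_text):
    """Return the set of client translator names that logged a successful connect."""
    return set(CONNECTED.findall(log_text))

# Two entries taken from the log above, unwrapped to one entry per line.
sample = """\
[2017-12-05 10:20:13.976211] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-0: Connected to archive-client-0, attached to remote volume '/srv/glusterfs/brick/archive'.
[2017-12-05 10:20:13.976299] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-archive-client-3: Connected to archive-client-3, attached to remote volume '/srv/glusterfs/brick/archive'.
"""

# The volume graph lists archive-client-0 through archive-client-9.
expected = {f"archive-client-{i}" for i in range(10)}
missing = expected - connected_clients(sample)
print(sorted(missing))
```

Run against the full log file instead of `sample`, an empty `missing` set would suggest the crash is not caused by an unreachable brick.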
Riccardo Murri
2017-Dec-06 13:57 UTC
[Gluster-users] SAMBA VFS module for GlusterFS crashes
Dear Anoop,

thank you very much for your detailed explanation.

> I think you are hitting a bug[1] from vfs module for GlusterFS inside Samba during a realpath() call.
>
> This regression got in when glfs_realpath() was modified in GlusterFS[2] to correctly handle memory allocation and corresponding freeing of string arguments. And this particular change is present from GlusterFS version 3.7.17 in 3.7 series, in 3.9 series and all above versions (but not present in 3.8 series).
>
> So if you are using any of the above mentioned GlusterFS versions(which contains the above change for glfs_realpath()), then you are recommended to install Samba version >= 4.4.9 in 4.4 series, >= 4.5.2 or any versions from higher series like 4.6, 4.7 etc which contains the fix for this crash.

I'm using GlusterFS 3.12.3 from the GlusterFS Ubuntu PPA [1]::

    $ apt-cache policy glusterfs-client
    glusterfs-client:
      Installed: 3.12.3-ubuntu1~xenial1
      Candidate: 3.12.3-ubuntu1~xenial1
      Version table:
     *** 3.12.3-ubuntu1~xenial1 500
            500 http://ppa.launchpad.net/gluster/glusterfs-3.12/ubuntu xenial/main amd64 Packages
            100 /var/lib/dpkg/status
         3.7.6-1ubuntu1 500
            500 http://nova.clouds.archive.ubuntu.com/ubuntu xenial/universe amd64 Packages
            500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages

So this should contain the fix already? Or is the fix Samba-side?

[1] https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12

Thanks,
Riccardo

--
Riccardo Murri / Email: riccardo.murri at gmail.com / Tel.: +41 77 458 98 32
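The version ranges in the quoted explanation can be written down as a small check. This is only a sketch that reads those ranges literally, taking the second Samba bound as 4.5.2 and later. The function names are made up for illustration, and it expects plain dotted upstream versions such as `3.12.3`, not full package strings like `3.12.3-ubuntu1~xenial1`.

```python
def parse(version):
    """Turn a dotted version such as '3.12.3' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def gluster_has_realpath_change(version):
    """True if this GlusterFS version carries the glfs_realpath() change,
    per the quoted ranges: 3.7.17+ in the 3.7 series, 3.9 and later, not 3.8."""
    v = parse(version)
    series = v[:2]
    if series == (3, 7):
        return v >= (3, 7, 17)
    if series == (3, 8):
        return False
    return series >= (3, 9)

def samba_has_fix(version):
    """True if this Samba version carries the vfs_glusterfs fix,
    per the quoted ranges: 4.4.9+ in the 4.4 series, otherwise 4.5.2 and later."""
    v = parse(version)
    if v[:2] == (4, 4):
        return v >= (4, 4, 9)
    return v >= (4, 5, 2)

def likely_affected(gluster_version, samba_version):
    """The crash is expected when GlusterFS has the change but Samba lacks the fix."""
    return gluster_has_realpath_change(gluster_version) and not samba_has_fix(samba_version)

# Versions reported in this thread: GlusterFS 3.12.3 with Samba 4.3.11.
print(likely_affected("3.12.3", "4.3.11"))
```

On this reading, GlusterFS 3.12.3 carries the change that exposes the bug, and the fix has to come from the Samba side.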