Hello,

I have two servers, 192.168.0.10 and 192.168.2.10. I'm using gluster 3.6.1 (installed from the gluster repo) on AWS Linux. Both servers are completely reachable on the LAN.
# rpm -qa|grep gluster
glusterfs-3.6.1-1.el6.x86_64
glusterfs-server-3.6.1-1.el6.x86_64
glusterfs-libs-3.6.1-1.el6.x86_64
glusterfs-api-3.6.1-1.el6.x86_64
glusterfs-cli-3.6.1-1.el6.x86_64
glusterfs-fuse-3.6.1-1.el6.x86_64

These are the commands I ran:
# gluster peer probe 192.168.2.10
# gluster volume create aloha replica 2 transport tcp 192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force
# gluster volume start aloha
# gluster volume set aloha network.ping-timeout 5
# gluster volume set aloha nfs.disable on

Problem number 1:
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows the log being cluttered with:
[2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid argument)
This happens every 3 seconds on both servers. It is related to NFS and probably rpcbind, but I absolutely want them disabled. As you can see, I've set gluster to disable NFS, so why doesn't it keep quiet about it then?

Problem number 2:
in fstab on server 192.168.0.10:
192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
in fstab on server 192.168.2.10:
192.168.2.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0

If I shut down one of the servers (192.168.2.10) and reboot the remaining one (192.168.0.10), it won't come up as fast as it should. It lags a few minutes waiting for gluster. After it eventually starts, the mount point is not mounted and the volume is stopped:
# gluster volume status
Status of volume: aloha
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.10:/var/aloha                   N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A

Task Status of Volume aloha
------------------------------------------------------------------------------
There are no active volume tasks

This didn't happen before, so fine, I first have to stop the volume and then start it again. It now shows as online:
Brick 192.168.0.10:/var/aloha                   49155   Y       3473
Self-heal Daemon on localhost                   N/A     Y       3507

# time mount -a
real 2m7.307s

# time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
real 2m7.365s

# strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
(attached)

# tail /var/log/glusterfs/* -f|grep -v readv
(attached)

I've done this setup before, so I'm amazed it doesn't work. I even have it in production at the moment, with the same options and setup, and there, for example, I'm not getting readv errors. I'm unable to test the mount part there, though, but I feel I covered it way back when I was testing the environment.
Any help is kindly appreciated.
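The brick state shown in the `gluster volume status` output above can also be checked from a script. What follows is a minimal Python sketch and not part of the original message: the sample text is copied from the status output quoted above, the helper name `offline_processes` is hypothetical, and the code assumes the four-column layout (process, port, online, PID) seen in that output.

```python
# Sample copied from the `gluster volume status` output quoted above.
STATUS_TEXT = """\
Status of volume: aloha
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.10:/var/aloha                   N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A
"""

# Row prefixes seen in that output; other row types are not handled here.
PROCESS_PREFIXES = ("Brick ", "Self-heal Daemon ")


def offline_processes(status_text):
    """Return the names of listed processes whose Online column is 'N'.

    Hypothetical helper: assumes the last three whitespace-separated
    fields of each process row are Port, Online and Pid.
    """
    offline = []
    for line in status_text.splitlines():
        if not line.startswith(PROCESS_PREFIXES):
            continue
        fields = line.rsplit(None, 3)
        if len(fields) != 4:
            continue  # wrapped or truncated row, skip it
        name, _port, online, _pid = fields
        if online == "N":
            offline.append(name.strip())
    return offline


print(offline_processes(STATUS_TEXT))
```

For the sample above, both the brick and the self-heal daemon are reported as offline, matching the state described after the reboot.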
-------------- next part -------------- execve("/bin/mount", ["mount", "-t", "glusterfs", "192.168.0.10:/aloha", "/var/www/hawaii"], [/* 34 vars */]) = 0 brk(0) = 0x7f8cec18a000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8cea585000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=24065, ...}) = 0 mmap(NULL, 24065, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8cea57f000 close(3) = 0 open("/lib64/libmount.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\227\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=246160, ...}) = 0 mmap(NULL, 2345408, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8cea12b000 mprotect(0x7f8cea166000, 2093056, PROT_NONE) = 0 mmap(0x7f8cea365000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3a000) = 0x7f8cea365000 mmap(0x7f8cea367000, 2496, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8cea367000 close(3) = 0 open("/lib64/libblkid.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p~\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=236656, ...}) = 0 mmap(NULL, 2335912, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8ce9ef0000 mprotect(0x7f8ce9f26000, 2097152, PROT_NONE) = 0 mmap(0x7f8cea126000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x36000) = 0x7f8cea126000 mmap(0x7f8cea12a000, 1192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8cea12a000 close(3) = 0 open("/lib64/libuuid.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\24\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=15648, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8cea57e000 mmap(NULL, 2110664, 
PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8ce9cec000 mprotect(0x7f8ce9cf0000, 2093056, PROT_NONE) = 0 mmap(0x7f8ce9eef000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7f8ce9eef000 close(3) = 0 open("/usr/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340^\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=126288, ...}) = 0 mmap(NULL, 2230272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8ce9acb000 mprotect(0x7f8ce9ae9000, 2093056, PROT_NONE) = 0 mmap(0x7f8ce9ce8000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d000) = 0x7f8ce9ce8000 mmap(0x7f8ce9cea000, 6144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8ce9cea000 close(3) = 0 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\30\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=2000552, ...}) = 0 mmap(NULL, 3820128, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8ce9726000 mprotect(0x7f8ce98c1000, 2097152, PROT_NONE) = 0 mmap(0x7f8ce9ac1000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19b000) = 0x7f8ce9ac1000 mmap(0x7f8ce9ac7000, 14944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8ce9ac7000 close(3) = 0 open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\16\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=19512, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8cea57d000 mmap(NULL, 2109712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8ce9522000 mprotect(0x7f8ce9525000, 2093056, PROT_NONE) = 0 mmap(0x7f8ce9724000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f8ce9724000 close(3) = 0 mmap(NULL, 4096, 
PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8cea57c000 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8cea57a000 arch_prctl(ARCH_SET_FS, 0x7f8cea57a800) = 0 mprotect(0x7f8ce9ac1000, 16384, PROT_READ) = 0 mprotect(0x7f8ce9724000, 4096, PROT_READ) = 0 mprotect(0x7f8ce9ce8000, 4096, PROT_READ) = 0 mprotect(0x7f8cea792000, 4096, PROT_READ) = 0 mprotect(0x7f8cea586000, 4096, PROT_READ) = 0 munmap(0x7f8cea57f000, 24065) = 0 statfs("/sys/fs/selinux", 0x7fffbde539a0) = -1 ENOENT (No such file or directory) statfs("/selinux", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=5127322, f_bfree=4679004, f_bavail=4653942, f_files=1310720, f_ffree=1256024, f_fsid={1741214550, 1681826442}, f_namelen=255, f_frsize=4096}) = 0 brk(0) = 0x7f8cec18a000 brk(0x7f8cec1ab000) = 0x7f8cec1ab000 open("/proc/filesystems", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8cea584000 read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 282 read(3, "", 1024) = 0 close(3) = 0 munmap(0x7f8cea584000, 4096) = 0 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=106065056, ...}) = 0 mmap(NULL, 106065056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8ce2ffb000 close(3) = 0 getuid() = 0 geteuid() = 0 getuid() = 0 geteuid() = 0 getgid() = 0 getegid() = 0 prctl(PR_GET_DUMPABLE) = 1 lstat("/etc/mtab", {st_mode=S_IFLNK|0777, st_size=12, ...}) = 0 getuid() = 0 geteuid() = 0 getgid() = 0 getegid() = 0 prctl(PR_GET_DUMPABLE) = 1 lstat("/dev/.mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 open("/dev/.mount/utab", O_RDWR|O_CREAT|O_CLOEXEC, 0644) = 3 close(3) = 0 getcwd("/root", 4095) = 6 readlink("/root/192.168.0.10:", 0x7fffbde517c0, 4096) = -1 ENOENT (No such file or directory) readlink("/var", 0x7fffbde51720, 4096) = -1 EINVAL (Invalid argument) readlink("/var/www", 0x7fffbde51720, 4096) = -1 EINVAL 
(Invalid argument) readlink("/var/www/hawaii", 0x7fffbde51720, 4096) = -1 EINVAL (Invalid argument) stat("/sbin/mount.glusterfs", {st_mode=S_IFREG|0755, st_size=16839, ...}) = 0 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f8cea57aad0) = 4147 wait4(-1, 0x7fffbde53930, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL, si_value={int=2817813648, ptr=0x7fffa7f46c90}} --- wait4(-1, Broadcast message from root at web1 (unknown) at 18:00 ... The system is going down for power off NOW! Using username "root". Authenticating with public key "rsa-key-20131216" Last login: Mon Nov 10 17:52:53 2014 from 93.107.38.51 __| __|_ ) _| ( / Amazon Linux AMI ___|\___|___| https://aws.amazon.com/amazon-linux-ami/2014.09-release-notes/ [root at web1 ~]# [root at web1 ~]# [root at web1 ~]# strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii [root at web1 ~]# strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii execve("/bin/mount", ["mount", "-t", "glusterfs", "192.168.0.10:/aloha", "/var/www/hawaii"], [/* 34 vars */]) = 0 brk(0) = 0x7fb78987f000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb787e6d000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=24065, ...}) = 0 mmap(NULL, 24065, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb787e67000 close(3) = 0 open("/lib64/libmount.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\227\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=246160, ...}) = 0 mmap(NULL, 2345408, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb787a13000 mprotect(0x7fb787a4e000, 2093056, PROT_NONE) = 0 mmap(0x7fb787c4d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3a000) = 0x7fb787c4d000 mmap(0x7fb787c4f000, 
2496, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb787c4f000 close(3) = 0 open("/lib64/libblkid.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p~\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=236656, ...}) = 0 mmap(NULL, 2335912, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb7877d8000 mprotect(0x7fb78780e000, 2097152, PROT_NONE) = 0 mmap(0x7fb787a0e000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x36000) = 0x7fb787a0e000 mmap(0x7fb787a12000, 1192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb787a12000 close(3) = 0 open("/lib64/libuuid.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\24\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=15648, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb787e66000 mmap(NULL, 2110664, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb7875d4000 mprotect(0x7fb7875d8000, 2093056, PROT_NONE) = 0 mmap(0x7fb7877d7000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7fb7877d7000 close(3) = 0 open("/usr/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340^\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=126288, ...}) = 0 mmap(NULL, 2230272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb7873b3000 mprotect(0x7fb7873d1000, 2093056, PROT_NONE) = 0 mmap(0x7fb7875d0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d000) = 0x7fb7875d0000 mmap(0x7fb7875d2000, 6144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb7875d2000 close(3) = 0 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\30\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, 
st_size=2000552, ...}) = 0 mmap(NULL, 3820128, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb78700e000 mprotect(0x7fb7871a9000, 2097152, PROT_NONE) = 0 mmap(0x7fb7873a9000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19b000) = 0x7fb7873a9000 mmap(0x7fb7873af000, 14944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb7873af000 close(3) = 0 open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\16\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=19512, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb787e65000 mmap(NULL, 2109712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb786e0a000 mprotect(0x7fb786e0d000, 2093056, PROT_NONE) = 0 mmap(0x7fb78700c000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fb78700c000 close(3) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb787e64000 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb787e62000 arch_prctl(ARCH_SET_FS, 0x7fb787e62800) = 0 mprotect(0x7fb7873a9000, 16384, PROT_READ) = 0 mprotect(0x7fb78700c000, 4096, PROT_READ) = 0 mprotect(0x7fb7875d0000, 4096, PROT_READ) = 0 mprotect(0x7fb78807a000, 4096, PROT_READ) = 0 mprotect(0x7fb787e6e000, 4096, PROT_READ) = 0 munmap(0x7fb787e67000, 24065) = 0 statfs("/sys/fs/selinux", 0x7fff427cdb20) = -1 ENOENT (No such file or directory) statfs("/selinux", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=5127322, f_bfree=4679520, f_bavail=4654458, f_files=1310720, f_ffree=1256031, f_fsid={1741214550, 1681826442}, f_namelen=255, f_frsize=4096}) = 0 brk(0) = 0x7fb78987f000 brk(0x7fb7898a0000) = 0x7fb7898a0000 open("/proc/filesystems", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb787e6c000 
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 282
read(3, "", 1024) = 0
close(3) = 0
munmap(0x7fb787e6c000, 4096) = 0
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=106065056, ...}) = 0
mmap(NULL, 106065056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb7808e3000
close(3) = 0
getuid() = 0
geteuid() = 0
getuid() = 0
geteuid() = 0
getgid() = 0
getegid() = 0
prctl(PR_GET_DUMPABLE) = 1
lstat("/etc/mtab", {st_mode=S_IFLNK|0777, st_size=12, ...}) = 0
getuid() = 0
geteuid() = 0
getgid() = 0
getegid() = 0
prctl(PR_GET_DUMPABLE) = 1
lstat("/dev/.mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("/dev/.mount/utab", O_RDWR|O_CREAT|O_CLOEXEC, 0644) = 3
close(3) = 0
getcwd("/root", 4095) = 6
readlink("/root/192.168.0.10:", 0x7fff427cb940, 4096) = -1 ENOENT (No such file or directory)
readlink("/var", 0x7fff427cb8a0, 4096) = -1 EINVAL (Invalid argument)
readlink("/var/www", 0x7fff427cb8a0, 4096) = -1 EINVAL (Invalid argument)
readlink("/var/www/hawaii", 0x7fff427cb8a0, 4096) = -1 EINVAL (Invalid argument)
stat("/sbin/mount.glusterfs", {st_mode=S_IFREG|0755, st_size=16839, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb787e62ad0) = 3186
wait4(-1, Mount failed. Please check the log file for more details.
[{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 3186
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3186, si_status=1, si_utime=0, si_stime=0} ---
close(1) = 0
close(2) = 0
exit_group(1) = ?
+++ exited with 1 +++
-------------- next part --------------
==> /var/log/glusterfs/var-www-hawaii.log <==
[2014-11-10 18:11:43.694660] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.1 (args: /usr/sbin/glusterfs --volfile-server=192.168.0.10 --volfile-id=/aloha /var/www/hawaii)
[2014-11-10 18:11:43.701862] I [dht-shared.c:337:dht_init_regex] 0-aloha-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2014-11-10 18:11:43.704415] I [client.c:2280:notify] 0-aloha-client-0: parent translators are ready, attempting connect on transport
[2014-11-10 18:11:43.706575] I [client.c:2280:notify] 0-aloha-client-1: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume aloha-client-0
  2:     type protocol/client
  3:     option ping-timeout 5
  4:     option remote-host 192.168.0.10
  5:     option remote-subvolume /var/aloha
  6:     option transport-type socket
  7:     option username 18a8f81e-aba5-4948-9983-4791f74ce8aa
  8:     option password 0795e117-643c-46f9-ae5c-fb74e1fe40ab
  9:     option send-gids true
 10: end-volume
 11:
 12: volume aloha-client-1
 13:     type protocol/client
 14:     option ping-timeout 5
 15:     option remote-host 192.168.2.10
 16:     option remote-subvolume /var/aloha
 17:     option transport-type socket
 18:     option username 18a8f81e-aba5-4948-9983-4791f74ce8aa
 19:     option password 0795e117-643c-46f9-ae5c-fb74e1fe40ab
 20:     option send-gids true
 21: end-volume
 22:
 23: volume aloha-replicate-0
 24:     type cluster/replicate
 25:     subvolumes aloha-client-0 aloha-client-1
 26: end-volume
 27:
 28: volume aloha-dht
 29:     type cluster/distribute
 30:     subvolumes aloha-replicate-0
 31: end-volume
 32:
 33: volume aloha-write-behind
 34:     type performance/write-behind
 35:     subvolumes aloha-dht
 36: end-volume
 37:
 38: volume aloha-read-ahead
 39:     type performance/read-ahead
 40:     subvolumes aloha-write-behind
 41: end-volume
 42:
 43: volume aloha-io-cache
 44:     type performance/io-cache
 45:     subvolumes aloha-read-ahead
 46: end-volume
 47:
 48: volume aloha-quick-read
 49:     type performance/quick-read
 50:     subvolumes aloha-io-cache
 51: end-volume
 52:
 53: volume aloha-open-behind
 54:     type performance/open-behind
 55:     subvolumes aloha-quick-read
 56: end-volume
 57:
 58: volume aloha-md-cache
 59:     type performance/md-cache
 60:     subvolumes aloha-open-behind
 61: end-volume
 62:
 63: volume aloha
 64:     type debug/io-stats
 65:     option latency-measurement off
 66:     option count-fop-hits off
 67:     subvolumes aloha-md-cache
 68: end-volume
 69:
 70: volume meta-autoload
 71:     type meta
 72:     subvolumes aloha
 73: end-volume
 74:
+------------------------------------------------------------------------------+
[2014-11-10 18:11:43.709691] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-aloha-client-0: changing port to 49155 (from 0)
[2014-11-10 18:11:43.712281] I [client-handshake.c:1415:select_server_supported_programs] 0-aloha-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-11-10 18:11:43.712506] I [client-handshake.c:1200:client_setvolume_cbk] 0-aloha-client-0: Connected to aloha-client-0, attached to remote volume '/var/aloha'.
[2014-11-10 18:11:43.712526] I [client-handshake.c:1212:client_setvolume_cbk] 0-aloha-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2014-11-10 18:11:43.712592] I [MSGID: 108005] [afr-common.c:3553:afr_notify] 0-aloha-replicate-0: Subvolume 'aloha-client-0' came back up; going online.
[2014-11-10 18:11:43.712628] I [client-handshake.c:188:client_set_lk_version_cbk] 0-aloha-client-0: Server lk version = 1

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd."
repeated 7 times between [2014-11-10 18:11:30.596042] and [2014-11-10 18:11:52.009594] [2014-11-10 18:11:55.010128] I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd. [2014-11-10 18:12:38.297258] I [MSGID: 106004] [glusterd-handler.c:4365:__glusterd_peer_rpc_notify] 0-management: Peer 680aafcc-507b-48cb-b727-6ee472a6ff91, in Peer in Cluster state, has disconnected from glusterd. ==> /var/log/glusterfs/cli.log <=[2014-11-10 18:12:47.576652] D [cli.c:612:cli_rpc_init] 0-cli: Connecting to glusterd using default socket [2014-11-10 18:12:47.576727] D [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glusterfs: defaulting frame-timeout to 30mins [2014-11-10 18:12:47.576745] D [rpc-clnt.c:986:rpc_clnt_connection_init] 0-glusterfs: disable ping-timeout [2014-11-10 18:12:47.576768] D [rpc-transport.c:262:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so [2014-11-10 18:12:47.578346] D [socket.c:3684:socket_init] 0-glusterfs: disabling nodelay [2014-11-10 18:12:47.578373] D [socket.c:3799:socket_init] 0-glusterfs: SSL support on the I/O path is NOT enabled [2014-11-10 18:12:47.578386] D [socket.c:3802:socket_init] 0-glusterfs: SSL support for glusterd is NOT enabled [2014-11-10 18:12:47.578398] D [socket.c:3819:socket_init] 0-glusterfs: using system polling thread [2014-11-10 18:12:47.578422] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect [2014-11-10 18:12:47.578438] T [socket.c:2871:socket_connect] 0-glusterfs: connecting 0x10e3580, state=0 gen=0 sock=-1 [2014-11-10 18:12:47.578454] T [name.c:290:af_unix_client_get_remote_sockaddr] 0-glusterfs: using connect-path /var/run/glusterd.socket [2014-11-10 18:12:47.578503] T [name.c:106:af_unix_client_bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to assign default value [2014-11-10 18:12:47.578685] D [rpc-clnt.c:972:rpc_clnt_connection_init] 
0-glusterfs: defaulting frame-timeout to 30mins [2014-11-10 18:12:47.578704] D [rpc-clnt.c:986:rpc_clnt_connection_init] 0-glusterfs: disable ping-timeout [2014-11-10 18:12:47.578722] D [rpc-transport.c:262:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so [2014-11-10 18:12:47.578765] D [socket.c:3799:socket_init] 0-glusterfs: SSL support on the I/O path is NOT enabled [2014-11-10 18:12:47.578793] D [socket.c:3802:socket_init] 0-glusterfs: SSL support for glusterd is NOT enabled [2014-11-10 18:12:47.578805] D [socket.c:3819:socket_init] 0-glusterfs: using system polling thread [2014-11-10 18:12:47.578817] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect [2014-11-10 18:12:47.578842] T [socket.c:2871:socket_connect] 0-glusterfs: connecting 0x10ebd50, state=0 gen=0 sock=-1 [2014-11-10 18:12:47.578856] T [name.c:290:af_unix_client_get_remote_sockaddr] 0-glusterfs: using connect-path /tmp/quotad.socket [2014-11-10 18:12:47.578874] T [name.c:106:af_unix_client_bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to assign default value [2014-11-10 18:12:47.578925] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.578955] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.578983] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.578998] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579011] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579036] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579057] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579071] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579100] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579164] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 
[2014-11-10 18:12:47.579182] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579198] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579212] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579227] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579241] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579542] I [cli-cmd-volume.c:1778:cli_check_gsync_present] 0-: geo-replication not installed [2014-11-10 18:12:47.579573] D [cli-cmd-volume.c:1799:cli_check_gsync_present] 0-cli: Returning -1 [2014-11-10 18:12:47.579593] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579613] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579640] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579666] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579686] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579704] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579722] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579734] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579750] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579763] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579780] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579794] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579805] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579816] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579837] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579854] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 
18:12:47.579871] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579885] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579897] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579910] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579923] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579935] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579948] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579960] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579974] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.579989] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580003] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580014] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580024] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580037] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580054] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580068] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580083] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580096] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580111] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580128] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580141] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580154] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580166] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0 [2014-11-10 18:12:47.580226] T [cli.c:264:cli_rpc_notify] 0-glusterfs: got 
RPC_CLNT_CONNECT
[2014-11-10 18:12:47.580258] T [cli-quotad-client.c:94:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2014-11-10 18:12:47.580273] I [socket.c:2344:socket_event_handler] 0-transport: disconnecting now
[2014-11-10 18:12:47.580292] T [cli-quotad-client.c:100:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2014-11-10 18:12:47.580364] T [rpc-clnt.c:1381:rpc_clnt_record] 0-glusterfs: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2014-11-10 18:12:47.580403] T [rpc-clnt.c:1238:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 72, payload: 8, rpc hdr: 64
[2014-11-10 18:12:47.580657] T [socket.c:2863:socket_connect] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f4ff80f7420] (--> /usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0x7293)[0x7f4feeb99293] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x468)[0x7f4ff784cf98] (--> gluster(cli_submit_request+0xdb)[0x40a9bb] (--> gluster(cli_cmd_submit+0x8e)[0x40b7be] ))))) 0-glusterfs: connect () called on transport already connected
[2014-11-10 18:12:47.580972] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 3) to rpc-transport (glusterfs)
[2014-11-10 18:12:47.580993] D [rpc-clnt-ping.c:231:rpc_clnt_start_ping] 0-glusterfs: ping timeout is 0, returning

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2014-11-10 18:12:47.580975] I [glusterd-handler.c:1225:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req

==> /var/log/glusterfs/cli.log <==
[2014-11-10 18:12:47.581073] T [rpc-clnt.c:660:rpc_clnt_reply_init] 0-glusterfs: received rpc message (RPC XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 3) from rpc-transport (glusterfs)
[2014-11-10 18:12:47.581098] D [cli-rpc-ops.c:411:gf_cli_list_friends_cbk] 0-cli: Received resp to list: 0
[2014-11-10 18:12:47.581192] D [cli-cmd.c:384:cli_cmd_submit] 0-cli: Returning 0
[2014-11-10 18:12:47.581209] D [cli-rpc-ops.c:3100:gf_cli_list_friends] 0-cli: Returning 0
[2014-11-10 18:12:47.581224] I [input.c:36:cli_batch] 0-: Exiting with: 0

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==

==> /var/log/glusterfs/glustershd.log <==
[2014-11-10 18:13:37.945294] E [socket.c:2267:socket_connect_finish] 0-aloha-client-1: connection to 192.168.2.10:24007 failed (Connection timed out)

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==

==> /var/log/glusterfs/var-www-hawaii.log <==
[2014-11-10 18:13:51.001224] E [socket.c:2267:socket_connect_finish] 0-aloha-client-1: connection to 192.168.2.10:24007 failed (Connection timed out)
[2014-11-10 18:13:51.005235] I [fuse-bridge.c:5080:fuse_graph_setup] 0-fuse: switched to graph 0
[2014-11-10 18:13:51.005408] I [fuse-bridge.c:4009:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2014-11-10 18:13:51.007416] I [afr-common.c:1485:afr_local_discovery_cbk] 0-aloha-replicate-0: selecting local read_child aloha-client-0

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2014-11-10 18:11:55.010128] and [2014-11-10 18:13:52.030151]
[2014-11-10 18:13:55.030642] I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd.
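The timestamps in the logs above allow two figures from the message to be cross-checked. The following is a minimal Python sketch and not part of the original post: the helper name `seconds_between` is hypothetical, all four timestamps are copied from the log lines above, and the per-message interval assumes the "repeated 39 times" summary covers 39 equally spaced intervals.

```python
from datetime import datetime

# Timestamp layout used in the gluster log lines above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def seconds_between(start, end):
    """Elapsed seconds between two gluster-style log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# var-www-hawaii.log: mount client start, then the connect failure
# for aloha-client-1 (192.168.2.10:24007).
client_started = "2014-11-10 18:11:43.694660"
connect_timed_out = "2014-11-10 18:13:51.001224"
mount_delay = seconds_between(client_started, connect_timed_out)

# etc-glusterfs-glusterd.vol.log: bounds of the "repeated 39 times" summary.
first_nfs_msg = "2014-11-10 18:11:55.010128"
last_nfs_msg = "2014-11-10 18:13:52.030151"
nfs_interval = seconds_between(first_nfs_msg, last_nfs_msg) / 39  # assumed 39 intervals

print(f"mount delay: {mount_delay:.1f} s")               # 127.3 s
print(f"nfs message interval: {nfs_interval:.2f} s")     # 3.00 s
```

The first figure is close to the `real 2m7.365s` reported by `time mount`, which is consistent with (but does not prove) the mount waiting for the connection attempt to the powered-off peer to time out. The second agrees with the "every 3 seconds" observation in Problem number 1.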
Pranith Kumar Karampuri
2014-Nov-11 01:19 UTC
[Gluster-users] Mount problems when secondary node down
On 11/10/2014 11:47 PM, A F wrote:
> Hello,
>
> I have two servers, 192.168.0.10 and 192.168.2.10. I'm using gluster
> 3.6.1 (installed from gluster repo) on AWS Linux. Both servers are
> completely reachable in LAN.
> # rpm -qa|grep gluster
> glusterfs-3.6.1-1.el6.x86_64
> glusterfs-server-3.6.1-1.el6.x86_64
> glusterfs-libs-3.6.1-1.el6.x86_64
> glusterfs-api-3.6.1-1.el6.x86_64
> glusterfs-cli-3.6.1-1.el6.x86_64
> glusterfs-fuse-3.6.1-1.el6.x86_64
>
> These are the commands I ran:
> # gluster peer probe 192.168.2.10
> # gluster volume create aloha replica 2 transport tcp 192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force
> # gluster volume start aloha
> # gluster volume set aloha network.ping-timeout 5
> # gluster volume set aloha nfs.disable on
>
> Problem number 1:
> tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows log
> cluttering with:
> [2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid argument)
> this happens every 3 seconds on both servers. It is related to NFS and
> probably rpcbind, but I absolutely want them disabled. As you see,
> I've set gluster to disable nfs - why doesn't it keep quiet about it
> then?
>
> Problem number 2:
> in fstab on server 192.168.0.10:
> 192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
> in fstab on server 192.168.2.10:
> 192.168.2.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
>
> If I shutdown one of the servers (192.168.2.10), and I reboot the
> remaining one (192.168.0.10), it won't come up as fast as it should.
> It lags a few minutes waiting for gluster. After it eventually starts,
> mount point is not mounted and volume is stopped:
> # gluster volume status
> Status of volume: aloha
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick 192.168.0.10:/var/aloha                   N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     N       N/A
>
> Task Status of Volume aloha
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> This didn't happen before, so fine, I first have to stop the volume
> and then start it again. It now shows as online:
> Brick 192.168.0.10:/var/aloha                   49155   Y       3473
> Self-heal Daemon on localhost                   N/A     Y       3507
>
> # time mount -a
> real 2m7.307s
>
> # time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
> real 2m7.365s
>
> # strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
> (attached)
>
> # tail /var/log/glusterfs/* -f|grep -v readv
> (attached)
>
> I've done this setup before, so I'm amazed it doesn't work. I even
> have it in production at the moment, with the same options and setup,
> and for example I'm not getting readv errors. I'm unable to test the
> mount part though, but I feel I have covered it way back when I was
> testing the environment.
> Any help is kindly appreciated.

CC glusterd folks

Pranith