Engelmann Florian
2015-Nov-10 11:57 UTC
[Gluster-users] concurrent "gluster volume status" crashes the command (v3.4 and v3.7)
Dear list,

running "gluster volume status" concurrently on all 3 GlusterFS nodes (which actually are LXC containers) somehow crashes the command. Two nodes reply "Another transaction is in progress. Please try again after sometime." and on the 3rd node the command hangs forever. Stopping the hanging command and running it again also results in "Another transaction is in progress. Please try again after sometime." on that machine. (A rough reproduction sketch follows in the P.S. below.)

The strace output ends like this:

[...]
connect(7, {sa_family=AF_LOCAL, sun_path="/var/run/gluster/quotad.socket"}, 110) = -1 ENOENT (No such file or directory)
fcntl(7, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK) = 0
epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLONESHOT, {u32=1, u64=4294967297}}) = 0
pipe([8, 9]) = 0
fcntl(9, F_SETFD, FD_CLOEXEC) = 0
pipe([10, 11]) = 0
fcntl(10, F_GETFL) = 0 (flags O_RDONLY)
fstat(10, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f67780e5000
lseek(10, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f67780d9a50) = 28493
close(-1) = -1 EBADF (Bad file descriptor)
close(11) = 0
close(-1) = -1 EBADF (Bad file descriptor)
close(9) = 0
read(8, "", 4) = 0
close(8) = 0
read(10, "gsyncd.py 0.0.1\n", 4096) = 16
wait4(28493, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 28493
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=28493, si_status=0, si_utime=5, si_stime=1} ---
close(10) = 0
munmap(0x7f67780e5000, 4096) = 0
close(-1) = -1 EBADF (Bad file descriptor)
close(-2) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6773545000
mprotect(0x7f6773545000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f6773d44f70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6773d459d0, tls=0x7f6773d45700, child_tidptr=0x7f6773d459d0) = 28496
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6772d44000
mprotect(0x7f6772d44000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f6773543f70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f67735449d0, tls=0x7f6773544700, child_tidptr=0x7f67735449d0) = 28497
futex(0x7f67735449d0, FUTEX_WAIT, 28497, NULLAnother transaction is in progress. Please try again after sometime. <unfinished ...>
+++ exited with 1 +++

I had to stop all volumes and restart glusterd to solve that problem.

Host OS: Ubuntu 14.04 LTS
LXC OS: Ubuntu 14.04 LTS

We first hit this issue with 3.4.2 (official Ubuntu packages) and upgraded to 3.7.5 (Launchpad) to check whether the problem still exists. It does.

Any ideas?

Thank you for your help,
Florian
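
P.S.: For reference, this is roughly how we trigger it. A minimal sketch, assuming SSH access to the three peers; the hostnames gluster1 to gluster3 are placeholders, and issuing the command by hand on all nodes at the same time behaves the same way:

#!/bin/sh
# Fire "gluster volume status" on all three peers at (roughly) the same time.
# Two of the nodes then fail with "Another transaction is in progress." and
# the third one hangs. Hostnames are placeholders for our three LXC nodes.
for host in gluster1 gluster2 gluster3; do
    ssh "$host" 'gluster volume status' &
done
wait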