Niels de Vos
2015-Dec-22 16:38 UTC
[Gluster-users] [Gluster-devel] glusterfsd crash due to page allocation failure
On Mon, Dec 21, 2015 at 03:55:08PM -0500, Glomski, Patrick wrote:
> Hello,
>
> We've recently upgraded from gluster 3.6.6 to 3.7.6 and have started
> encountering dmesg page allocation errors (stack trace is appended).
>
> It appears that glusterfsd now sometimes fills up the cache completely and
> crashes with a page allocation failure. I *believe* it mainly happens when
> copying lots of new data to the system, running a 'find', or similar. Hosts
> are all Scientific Linux 6.6 and these errors occur consistently on two
> separate gluster pools.
>
> Has anyone else seen this issue and are there any known fixes for it via
> sysctl kernel parameters or other means?
>
> Please let me know of any other diagnostic information that would help.

Could you explain a little more about this? The below is a message from
the kernel telling you that the mlx4_ib (Mellanox InfiniBand?) driver is
requesting more contiguous memory than is immediately available.

So, the questions I have regarding this:

1. how is infiniband involved/configured in this environment?
2. was there a change/update of the driver (kernel update maybe?)
3. do you get a coredump of the glusterfsd process when this happens?
4. is this a fuse mount process, or a brick process? (check by PID?)

Thanks,
Niels

> Thanks,
> Patrick
>
> [1458118.134697] glusterfsd: page allocation failure. order:5, mode:0x20
> [1458118.134701] Pid: 6010, comm: glusterfsd Not tainted 2.6.32-573.3.1.el6.x86_64 #1
> [1458118.134702] Call Trace:
> [1458118.134714] [<ffffffff8113770c>] ? __alloc_pages_nodemask+0x7dc/0x950
> [1458118.134728] [<ffffffffa0321800>] ? mlx4_ib_post_send+0x680/0x1f90 [mlx4_ib]
> [1458118.134733] [<ffffffff81176e92>] ? kmem_getpages+0x62/0x170
> [1458118.134735] [<ffffffff81177aaa>] ? fallback_alloc+0x1ba/0x270
> [1458118.134736] [<ffffffff811774ff>] ? cache_grow+0x2cf/0x320
> [1458118.134738] [<ffffffff81177829>] ? ____cache_alloc_node+0x99/0x160
> [1458118.134743] [<ffffffff8145f732>] ? pskb_expand_head+0x62/0x280
> [1458118.134744] [<ffffffff81178479>] ? __kmalloc+0x199/0x230
> [1458118.134746] [<ffffffff8145f732>] ? pskb_expand_head+0x62/0x280
> [1458118.134748] [<ffffffff8146001a>] ? __pskb_pull_tail+0x2aa/0x360
> [1458118.134751] [<ffffffff8146f389>] ? harmonize_features+0x29/0x70
> [1458118.134753] [<ffffffff8146f9f4>] ? dev_hard_start_xmit+0x1c4/0x490
> [1458118.134758] [<ffffffff8148cf8a>] ? sch_direct_xmit+0x15a/0x1c0
> [1458118.134759] [<ffffffff8146ff68>] ? dev_queue_xmit+0x228/0x320
> [1458118.134762] [<ffffffff8147665d>] ? neigh_connected_output+0xbd/0x100
> [1458118.134766] [<ffffffff814abc67>] ? ip_finish_output+0x287/0x360
> [1458118.134767] [<ffffffff814abdf8>] ? ip_output+0xb8/0xc0
> [1458118.134769] [<ffffffff814ab04f>] ? __ip_local_out+0x9f/0xb0
> [1458118.134770] [<ffffffff814ab085>] ? ip_local_out+0x25/0x30
> [1458118.134772] [<ffffffff814ab580>] ? ip_queue_xmit+0x190/0x420
> [1458118.134773] [<ffffffff81137059>] ? __alloc_pages_nodemask+0x129/0x950
> [1458118.134776] [<ffffffff814c0c54>] ? tcp_transmit_skb+0x4b4/0x8b0
> [1458118.134778] [<ffffffff814c319a>] ? tcp_write_xmit+0x1da/0xa90
> [1458118.134779] [<ffffffff81178cbd>] ? __kmalloc_node+0x4d/0x60
> [1458118.134780] [<ffffffff814c3a80>] ? tcp_push_one+0x30/0x40
> [1458118.134782] [<ffffffff814b410c>] ? tcp_sendmsg+0x9cc/0xa20
> [1458118.134786] [<ffffffff8145836b>] ? sock_aio_write+0x19b/0x1c0
> [1458118.134788] [<ffffffff814581d0>] ? sock_aio_write+0x0/0x1c0
> [1458118.134791] [<ffffffff8119169b>] ? do_sync_readv_writev+0xfb/0x140
> [1458118.134797] [<ffffffff810a14b0>] ? autoremove_wake_function+0x0/0x40
> [1458118.134801] [<ffffffff8123e92f>] ? selinux_file_permission+0xbf/0x150
> [1458118.134804] [<ffffffff812316d6>] ? security_file_permission+0x16/0x20
> [1458118.134806] [<ffffffff81192746>] ? do_readv_writev+0xd6/0x1f0
> [1458118.134807] [<ffffffff811928a6>] ? vfs_writev+0x46/0x60
> [1458118.134809] [<ffffffff811929d1>] ? sys_writev+0x51/0xd0
> [1458118.134812] [<ffffffff810e88ae>] ? __audit_syscall_exit+0x25e/0x290
> [1458118.134816] [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
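Niels's reading of the trace can be checked directly on the affected hosts: "order:5" means the kernel needed 2^5 = 32 physically contiguous pages (128 KiB with 4 KiB pages), and on this kernel mode:0x20 corresponds to an atomic allocation that cannot wait for reclaim, so the request fails whenever free memory is too fragmented. The Python sketch below only assumes the standard /proc/buddyinfo layout; it is a diagnostic illustration, not a fix from the thread.

#!/usr/bin/env python
# Rough fragmentation check for the "order:5" failure above: /proc/buddyinfo
# lists, per memory zone, how many free blocks exist at each order, so if
# the order-5-and-higher columns are near zero the host cannot satisfy a
# 32-page (128 KiB) contiguous request. Diagnostic sketch only.

FAILED_ORDER = 5  # from "page allocation failure. order:5" in the trace

with open("/proc/buddyinfo") as f:
    for line in f:
        fields = line.split()
        # Layout: Node <n>, zone <name> <order-0 count> <order-1 count> ...
        node = fields[1].rstrip(",")
        zone = fields[3]
        counts = [int(c) for c in fields[4:]]
        print("node %s zone %-8s free blocks of order >= %d: %d"
              % (node, zone, FAILED_ORDER, sum(counts[FAILED_ORDER:])))

If the order-5-and-up counts sit near zero under load, raising vm.min_free_kbytes is the sysctl most commonly suggested for recurring high-order atomic allocation failures; it only gives the kernel more reclaim headroom, though, and does not explain why glusterfsd is also dumping core.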
David Robinson
2015-Dec-22 17:15 UTC
[Gluster-users] [Gluster-devel] glusterfsd crash due to page allocation failure
Niels,

> 1. how is infiniband involved/configured in this environment?

gfsib01bkp and gfsib02bkp are connected via InfiniBand. We are using tcp
transport, as I was never able to get RDMA to work.

Volume Name: gfsbackup
Type: Distribute
Volume ID: e78d5123-d9bc-4d88-9c73-61d28abf0b41
Status: Started
Number of Bricks: 7
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/gfsbackup
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/gfsbackup
Brick3: gfsib02bkp.corvidtec.com:/data/brick01bkp/gfsbackup
Brick4: gfsib02bkp.corvidtec.com:/data/brick02bkp/gfsbackup
Brick5: gfsib02bkp.corvidtec.com:/data/brick03bkp/gfsbackup
Brick6: gfsib02bkp.corvidtec.com:/data/brick04bkp/gfsbackup
Brick7: gfsib02bkp.corvidtec.com:/data/brick05bkp/gfsbackup

> 2. was there a change/update of the driver (kernel update maybe?)

Before upgrading these servers from gluster 3.6.6 to 3.7.6, I did a
'yum update' which did upgrade the kernel. The current kernel is
2.6.32-573.12.1.el6.x86_64.

> 3. do you get a coredump of the glusterfsd process when this happens?

There is a series of core files in /, written around the same time that
this happens:

-rw------- 1 root root 168865792 Dec 22 10:45 core.3700
-rw------- 1 root root 168861696 Dec 22 10:45 core.3661
-rw------- 1 root root 168861696 Dec 22 10:45 core.3706
-rw------- 1 root root 168861696 Dec 22 10:45 core.3677
-rw------- 1 root root 168861696 Dec 22 10:45 core.3669
-rw------- 1 root root 168857600 Dec 22 10:45 core.3654
-rw------- 1 root root 254345216 Dec 22 10:45 core.3693
-rw------- 1 root root 254341120 Dec 22 10:45 core.3685

> 4. is this a fuse mount process, or a brick process? (check by PID?)

I have rebooted the machine, as it was in a bad state and I could no
longer write to the gluster volume. When it happens again, I will check
the PID. This machine has both brick processes and fuse mounts: the
storage servers mount the volume through a fuse mount, and I then use
rsync to back up my primary storage system.

David
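On question 4, one quick way to map a PID (from dmesg or a core.<pid> file) to a role is to look at the gluster process name and command line: bricks conventionally run as 'glusterfsd', while fuse mounts and the other client-side daemons run as 'glusterfs'. The sketch below assumes that conventional naming and is illustrative only.

#!/usr/bin/env python
# Classify running gluster processes so a PID from dmesg or a core.<pid>
# file can be matched to a role. Assumes the conventional naming: brick
# processes run as 'glusterfsd', fuse mounts / self-heal / nfs daemons as
# 'glusterfs'. Illustrative sketch only.
import os

def cmdline(pid):
    # /proc/<pid>/cmdline is NUL-separated; it vanishes if the process exits.
    try:
        with open("/proc/%s/cmdline" % pid) as f:
            return f.read().split("\0")
    except IOError:
        return []

for pid in sorted([p for p in os.listdir("/proc") if p.isdigit()], key=int):
    args = cmdline(pid)
    if not args or not args[0]:
        continue
    name = os.path.basename(args[0])
    if name == "glusterfsd":
        role = "brick process"
    elif name == "glusterfs":
        role = "fuse mount / client-side daemon"
    else:
        continue
    print("%6s  %-32s  %s" % (pid, role, " ".join(args).strip()))

For what it is worth, the dmesg line above already reports comm "glusterfsd", which on a conventional setup points at a brick process rather than a fuse mount, but checking the PID as Niels suggests is the way to confirm it.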
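On the core files from the answer to question 3, the usual next step is to pull backtraces with gdb so they can be attached to a bug report. The sketch below is only a convenience wrapper: the /usr/sbin/glusterfsd path is an assumption (check with `file /core.<pid>` which executable actually wrote each core), and readable symbol names require the matching glusterfs debuginfo packages.

#!/usr/bin/env python
# Batch-extract backtraces from the core.<pid> files in /. BINARY is an
# assumption: confirm with `file /core.<pid>` which executable produced
# each core, and install the matching debuginfo packages for symbols.
import glob
import subprocess

BINARY = "/usr/sbin/glusterfsd"  # assumed; adjust per `file` output

for core in sorted(glob.glob("/core.*")):
    print("==== %s ====" % core)
    # gdb --batch runs the -ex commands and exits; output goes to stdout.
    subprocess.call(["gdb", "--batch",
                     "-ex", "bt",
                     "-ex", "thread apply all bt",
                     BINARY, core])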