Ankireddypalle Reddy
2017-Jan-10 10:39 UTC
[Gluster-users] Lot of EIO errors in disperse volume
Hi,

We upgraded to GlusterFS 3.7.18 yesterday. We see a lot of failures in our applications. Most of the errors are EIO. The following log lines are commonly seen in the logs:

The message "W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-4: Mismatching xdata in answers of 'LOOKUP'" repeated 2 times between [2017-01-10 02:46:25.069809] and [2017-01-10 02:46:25.069835]
[2017-01-10 02:46:25.069852] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-5: Mismatching xdata in answers of 'LOOKUP'
The message "W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-5: Mismatching xdata in answers of 'LOOKUP'" repeated 2 times between [2017-01-10 02:46:25.069852] and [2017-01-10 02:46:25.069873]
[2017-01-10 02:46:25.069910] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-6: Mismatching xdata in answers of 'LOOKUP'
...
[2017-01-10 02:46:26.520774] I [MSGID: 109036] [dht-common.c:9076:dht_log_new_layout_for_dir_selfheal] 0-StoragePool-dht: Setting layout of /Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854213/CHUNK_51334585 with [Subvol_name: StoragePool-disperse-0, Err: -1 , Start: 3221225466 , Stop: 3758096376 , Hash: 1 ], [Subvol_name: StoragePool-disperse-1, Err: -1 , Start: 3758096377 , Stop: 4294967295 , Hash: 1 ], [Subvol_name: StoragePool-disperse-2, Err: -1 , Start: 0 , Stop: 536870910 , Hash: 1 ], [Subvol_name: StoragePool-disperse-3, Err: -1 , Start: 536870911 , Stop: 1073741821 , Hash: 1 ], [Subvol_name: StoragePool-disperse-4, Err: -1 , Start: 1073741822 , Stop: 1610612732 , Hash: 1 ], [Subvol_name: StoragePool-disperse-5, Err: -1 , Start: 1610612733 , Stop: 2147483643 , Hash: 1 ], [Subvol_name: StoragePool-disperse-6, Err: -1 , Start: 2147483644 , Stop: 2684354554 , Hash: 1 ], [Subvol_name: StoragePool-disperse-7, Err: -1 , Start: 2684354555 , Stop: 3221225465 , Hash: 1 ],
[2017-01-10 02:46:26.522841] N [MSGID: 122031] [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-3: Mismatching dictionary in answers of 'GF_FOP_XATTROP'
The message "N [MSGID: 122031] [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-3: Mismatching dictionary in answers of 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10 02:46:26.522841] and [2017-01-10 02:46:26.522894]
[2017-01-10 02:46:26.522898] W [MSGID: 122040] [ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-3: Failed to get size and version [Input/output error]
[2017-01-10 02:46:26.523115] N [MSGID: 122031] [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-6: Mismatching dictionary in answers of 'GF_FOP_XATTROP'
The message "N [MSGID: 122031] [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-6: Mismatching dictionary in answers of 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10 02:46:26.523115] and [2017-01-10 02:46:26.523143]
[2017-01-10 02:46:26.523147] W [MSGID: 122040] [ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-6: Failed to get size and version [Input/output error]
[2017-01-10 02:46:26.523302] N [MSGID: 122031] [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-2: Mismatching dictionary in answers of 'GF_FOP_XATTROP'
The message "N [MSGID: 122031] [ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-2: Mismatching dictionary in answers of 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10 02:46:26.523302] and [2017-01-10 02:46:26.523324]
[2017-01-10 02:46:26.523328] W [MSGID: 122040] [ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-2: Failed to get size and version [Input/output error]

[root@glusterfs3 Log_Files]# gluster --version
glusterfs 3.7.18 built on Dec 8 2016 06:34:26

[root@glusterfs3 Log_Files]# gluster volume info

Volume Name: StoragePool
Type: Distributed-Disperse
Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f
Status: Started
Number of Bricks: 8 x (2 + 1) = 24
Transport-type: tcp
Bricks:
Brick1: glusterfs1sds:/ws/disk1/ws_brick
Brick2: glusterfs2sds:/ws/disk1/ws_brick
Brick3: glusterfs3sds:/ws/disk1/ws_brick
Brick4: glusterfs1sds:/ws/disk2/ws_brick
Brick5: glusterfs2sds:/ws/disk2/ws_brick
Brick6: glusterfs3sds:/ws/disk2/ws_brick
Brick7: glusterfs1sds:/ws/disk3/ws_brick
Brick8: glusterfs2sds:/ws/disk3/ws_brick
Brick9: glusterfs3sds:/ws/disk3/ws_brick
Brick10: glusterfs1sds:/ws/disk4/ws_brick
Brick11: glusterfs2sds:/ws/disk4/ws_brick
Brick12: glusterfs3sds:/ws/disk4/ws_brick
Brick13: glusterfs1sds:/ws/disk5/ws_brick
Brick14: glusterfs2sds:/ws/disk5/ws_brick
Brick15: glusterfs3sds:/ws/disk5/ws_brick
Brick16: glusterfs1sds:/ws/disk6/ws_brick
Brick17: glusterfs2sds:/ws/disk6/ws_brick
Brick18: glusterfs3sds:/ws/disk6/ws_brick
Brick19: glusterfs1sds:/ws/disk7/ws_brick
Brick20: glusterfs2sds:/ws/disk7/ws_brick
Brick21: glusterfs3sds:/ws/disk7/ws_brick
Brick22: glusterfs1sds:/ws/disk8/ws_brick
Brick23: glusterfs2sds:/ws/disk8/ws_brick
Brick24: glusterfs3sds:/ws/disk8/ws_brick
Options Reconfigured:
performance.readdir-ahead: on
diagnostics.client-log-level: INFO

Thanks and Regards,
Ram
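To see at a glance which disperse subvolumes are reporting mismatches, the log lines above can be tallied by message ID and translator name. The following is only a minimal sketch, assuming the client log format shown in this message; the names `ENTRY`, `REPEAT`, `tally`, and `SAMPLE` are made up for illustration, and it assumes that a `repeated N times` summary line stands for N further occurrences of the quoted message.

```python
import re
from collections import Counter

# Header of a log entry as it appears in the excerpt above, e.g.
#   W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-4: ...
ENTRY = re.compile(
    r"(?P<level>[A-Z]) \[MSGID: (?P<msgid>\d+)\] "
    r"\[[^\]]+\] (?P<xlator>\S+?): "
)
# Suffix of a rate-limited summary line: ... repeated 2 times between [...] and [...]
REPEAT = re.compile(r"repeated (\d+) times")


def tally(lines):
    """Count occurrences per (MSGID, translator) pair."""
    counts = Counter()
    for line in lines:
        entry = ENTRY.search(line)
        if entry is None:
            continue  # not a log entry (blank line, "...", shell output)
        occurrences = 1
        if line.startswith('The message "'):
            # Assumption: a summary line represents N suppressed repeats.
            repeat = REPEAT.search(line)
            occurrences = int(repeat.group(1)) if repeat else 1
        counts[(entry.group("msgid"), entry.group("xlator"))] += occurrences
    return counts


# Three lines copied from the log excerpt in this message.
SAMPLE = """
The message "W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-4: Mismatching xdata in answers of 'LOOKUP'" repeated 2 times between [2017-01-10 02:46:25.069809] and [2017-01-10 02:46:25.069835]
[2017-01-10 02:46:25.069852] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-5: Mismatching xdata in answers of 'LOOKUP'
The message "W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-5: Mismatching xdata in answers of 'LOOKUP'" repeated 2 times between [2017-01-10 02:46:25.069852] and [2017-01-10 02:46:25.069873]
""".strip().splitlines()

if __name__ == "__main__":
    for (msgid, xlator), n in sorted(tally(SAMPLE).items()):
        print(msgid, xlator, n)
```

Feeding a whole client log through `tally` (for example `tally(open(path))`, with `path` pointing at the mount log) should show whether the warnings are spread over all eight subvolumes or concentrated on a few of them.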
Xavier Hernandez
2017-Jan-10 11:35 UTC
[Gluster-users] [Gluster-devel] Lot of EIO errors in disperse volume
Hi Ram,

How did you upgrade Gluster, and from which version? Did you upgrade one server at a time and wait until self-heal finished before upgrading the next server?

Xavi
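Whether self-heal has finished is usually judged from the output of `gluster volume heal <volname> info`, which reports a `Number of entries:` line per brick. The following is a rough sketch of summing those counts, assuming that output format; the function name `pending_heal_entries` and the sample text are hypothetical and were not posted in this thread.

```python
import re


def pending_heal_entries(heal_info_output):
    """Sum the per-brick 'Number of entries: N' counts.

    Bricks that report a non-numeric count (for example when a brick is
    not connected) are skipped, so a result of 0 is only meaningful if
    every brick was reachable.
    """
    counts = re.findall(
        r"^Number of entries:\s*(\d+)\s*$", heal_info_output, flags=re.MULTILINE
    )
    return sum(int(n) for n in counts)


# Illustrative text only; the file path is a placeholder.
EXAMPLE_OUTPUT = """\
Brick glusterfs1sds:/ws/disk1/ws_brick
Number of entries: 0

Brick glusterfs2sds:/ws/disk1/ws_brick
/example/file
Number of entries: 1
"""

if __name__ == "__main__":
    print(pending_heal_entries(EXAMPLE_OUTPUT))
```

In a rolling upgrade, the next server would be upgraded only once this total stays at zero for the volume.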
Ankireddypalle Reddy
2017-Jan-10 11:41 UTC
[Gluster-users] [Gluster-devel] Lot of EIO errors in disperse volume
Xavi,

We had been running 3.7.8 on these servers and upgraded to 3.7.18 yesterday. We upgraded all the servers at the same time. The volume was brought down during the upgrade.

Thanks and Regards,
Ram