Hu Bert
2020-Jun-08 09:48 UTC
[Gluster-users] One error/warning message after upgrade 5.11 -> 6.8
Hi Strahil,

thx for your answer, but i assume that your approach won't help. It seems that this behaviour is permanent; e.g. a log entry like this:

[2020-06-08 09:40:03.948269] E [MSGID: 113001] [posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix: file: /gluster/md3/persistent/.glusterfs/38/30/38306ef8-6588-40cf-8be3-c0a022714612: gfid: 38306ef8-6588-40cf-8be3-c0a022714612 key:trusted.glusterfs.mdata [No such file or directory]
[2020-06-08 09:40:03.948333] E [MSGID: 113114] [posix-metadata.c:433:posix_set_mdata_xattr_legacy_files] 0-persistent-posix: gfid: 38306ef8-6588-40cf-8be3-c0a022714612 key:trusted.glusterfs.mdata [No such file or directory]
[2020-06-08 09:40:03.948422] I [MSGID: 115060] [server-rpc-fops.c:938:_gf_server_log_setxattr_failure] 0-persistent-server: 14193413: SETXATTR /images/generated/207/039/2070391/484x425r.jpg (38306ef8-6588-40cf-8be3-c0a022714612) ==> set-ctime-mdata, client: CTX_ID:b738017c-20a3-4547-afba-5b8933d8e6e5-GRAPH_ID:0-PID:1078-HOST:pepe-PC_NAME:persistent-client-2-RECON_NO:-1, error-xlator: persistent-posix

tells me that an error (ctime-mdata) is found and fixed. And this is happening over and over again.

A couple of minutes ago i wanted to begin with what you suggested and called 'gluster volume heal persistent info' and suddenly saw:

Brick gluster1:/gluster/md3/persistent
Status: Connected
Number of entries: 0

Brick gluster2:/gluster/md3/persistent
Status: Connected
Number of entries: 0

Brick gluster3:/gluster/md3/persistent
Status: Connected
Number of entries: 0

I thought 'wtf...'; the heal-count was 0 as well; but the next call ~15s later showed this again:

Brick gluster1:/gluster/md3/persistent
Number of entries: 31

Brick gluster2:/gluster/md3/persistent
Number of entries: 27

Brick gluster3:/gluster/md3/persistent
Number of entries: 4

For me it looks like the 'error found -> heal it' process works as it should, but due to the permanent errors (log file entries) a heal count of zero is almost impossible to see.

Well, one could deactivate features.ctime, as this seems to be the cause (as the log entries suggest), but i don't know if that is reasonable, i.e. if this feature is needed.

Best regards,
Hubert

On Mon, 8 Jun 2020 at 11:22, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>
> Hi Hubert,
>
> Here is one idea:
> Using 'gluster volume heal VOL info' can provide the gfids of files pending heal.
> Once you have them, you can find the inode of each file via 'ls -li
> /gluster/brick/.glusterfs/<first_two_characters_of_gfid>/<next_two_characters>/<gfid>'.
>
> Then you can search the brick with find for that inode number (don't forget the 'ionice' to reduce the pressure).
>
> Once you have the list of files, stat them via the FUSE client and check if they got healed.
>
> I fully agree that you need to first heal the volumes before proceeding further or you might get into a nasty situation.
>
> Best Regards,
> Strahil Nikolov
>
> On 8 June 2020 8:30:57 GMT+03:00, Hu Bert <revirii at googlemail.com> wrote:
> >Good morning,
> >
> >i just wanted to update the version from 6.8 to 6.9 on our replica 3
> >system (formerly version 5.11), and i see tons of these messages:
> >
> >[2020-06-08 05:25:55.192301] E [MSGID: 113001]
> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
> >file:
> >/gluster/md3/persistent/.glusterfs/43/31/43312aba-75c6-42c2-855c-e0db66d7748f:
> >gfid: 43312aba-75c6-42c2-855c-e0db66d7748f key:trusted.glusterfs.mdata
> >[No such file or directory]
> >[2020-06-08 05:25:55.192375] E [MSGID: 113114]
> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
> >0-persistent-posix: gfid: 43312aba-75c6-42c2-855c-e0db66d7748f
> >key:trusted.glusterfs.mdata [No such file or directory]
> >[2020-06-08 05:25:55.192426] I [MSGID: 115060]
> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
> >0-persistent-server: 13382741: SETXATTR
> ><gfid:43312aba-75c6-42c2-855c-e0db66d7748f>
> >(43312aba-75c6-42c2-855c-e0db66d7748f) ==> set-ctime-mdata, client:
> >CTX_ID:e223ca30-6c30-4a40-ae98-a418143ce548-GRAPH_ID:0-PID:1006-HOST:sam-PC_NAME:persistent-client-2-RECON_NO:-1,
> >error-xlator: persistent-posix
> >
> >Still the ctime message. And a lot of these messages:
> >
> >[2020-06-08 05:25:53.016606] W [MSGID: 101159]
> >[inode.c:1330:__inode_unlink] 0-inode:
> >7043eed7-dbd7-4277-976f-d467349c1361/21194684.jpg: dentry not found in
> >839512f0-75de-414f-993d-1c35892f8560
> >
> >Well... the problem is: the volume seems to be in a permanent heal
> >status:
> >
> >Gathering count of entries to be healed on volume persistent has been
> >successful
> >Brick gluster1:/gluster/md3/persistent
> >Number of entries: 31
> >Brick gluster2:/gluster/md3/persistent
> >Number of entries: 6
> >Brick gluster3:/gluster/md3/persistent
> >Number of entries: 5
> >
> >a bit later:
> >
> >Gathering count of entries to be healed on volume persistent has been
> >successful
> >Brick gluster1:/gluster/md3/persistent
> >Number of entries: 100
> >Brick gluster2:/gluster/md3/persistent
> >Number of entries: 74
> >Brick gluster3:/gluster/md3/persistent
> >Number of entries: 1
> >
> >The number of entries never reaches 0-0-0; i already updated one of the
> >systems from 6.8 to 6.9, but updating the other 2 when heal isn't zero
> >doesn't seem to be a good idea. Well... any idea?
> >
> >Best regards,
> >Hubert
> >
> >On Fri, 8 May 2020 at 21:47, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
> >>
> >> On April 21, 2020 8:00:32 PM GMT+03:00, Amar Tumballi <amar at kadalu.io> wrote:
> >> >There seems to be a burst of issues when people upgraded to 5.x or
> >> >6.x from 3.12 (thanks to you and Strahil, who have reported most of
> >> >them).
> >> >
> >> >Latest update from Strahil is that if files are copied fresh on the
> >> >7.5 series, there are no issues.
> >> >
> >> >We are in the process of identifying the patch, and will also provide
> >> >an option to disable 'acl' for testing. Will update once we identify
> >> >the issue.
> >> >
> >> >Regards,
> >> >Amar
> >> >
> >> >On Sat, Apr 11, 2020 at 11:10 AM Hu Bert <revirii at googlemail.com> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> no one has seen such messages?
> >> >>
> >> >> Regards,
> >> >> Hubert
> >> >>
> >> >> On Mon, 6 Apr 2020 at 06:13, Hu Bert <revirii at googlemail.com> wrote:
> >> >> >
> >> >> > Hello,
> >> >> >
> >> >> > i just upgraded my servers and clients from 5.11 to 6.8; besides
> >> >> > one connection problem to the gluster download server everything
> >> >> > went fine.
> >> >> >
> >> >> > On the 3 gluster servers i mount the 2 volumes as well, and only
> >> >> > there (and not on all the other clients) there are some messages
> >> >> > in the log file of both mount logs:
> >> >> >
> >> >> > [2020-04-06 04:10:53.552561] W [MSGID: 114031]
> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> > 0-persistent-client-2: remote operation failed [Permission denied]
> >> >> > [2020-04-06 04:10:53.552635] W [MSGID: 114031]
> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> > 0-persistent-client-1: remote operation failed [Permission denied]
> >> >> > [2020-04-06 04:10:53.552639] W [MSGID: 114031]
> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> > 0-persistent-client-0: remote operation failed [Permission denied]
> >> >> > [2020-04-06 04:10:53.553226] E [MSGID: 148002]
> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-persistent-utime:
> >> >> > dict set of key for set-ctime-mdata failed [Permission denied]
> >> >> > The message "W [MSGID: 114031]
> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> > 0-persistent-client-2: remote operation failed [Permission denied]"
> >> >> > repeated 4 times between [2020-04-06 04:10:53.552561] and
> >> >> > [2020-04-06 04:10:53.745542]
> >> >> > The message "W [MSGID: 114031]
> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> > 0-persistent-client-1: remote operation failed [Permission denied]"
> >> >> > repeated 4 times between [2020-04-06 04:10:53.552635] and
> >> >> > [2020-04-06 04:10:53.745610]
> >> >> > The message "W [MSGID: 114031]
> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> > 0-persistent-client-0: remote operation failed [Permission denied]"
> >> >> > repeated 4 times between [2020-04-06 04:10:53.552639] and
> >> >> > [2020-04-06 04:10:53.745632]
> >> >> > The message "E [MSGID: 148002]
> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-persistent-utime:
> >> >> > dict set of key for set-ctime-mdata failed [Permission denied]"
> >> >> > repeated 4 times between [2020-04-06 04:10:53.553226] and
> >> >> > [2020-04-06 04:10:53.746080]
> >> >> >
> >> >> > Anything to worry about?
> >> >> >
> >> >> > Regards,
> >> >> > Hubert
> >> >> ________
> >> >>
> >> >> Community Meeting Calendar:
> >> >>
> >> >> Schedule -
> >> >> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> >> >> Bridge: https://bluejeans.com/441850968
> >> >>
> >> >> Gluster-users mailing list
> >> >> Gluster-users at gluster.org
> >> >> https://lists.gluster.org/mailman/listinfo/gluster-users
> >> >>
> >>
> >> Hi,
> >>
> >> Can you provide the xfs_info for the bricks from the volume?
> >>
> >> I have a theory that I want to confirm or reject.
> >>
> >> Best Regards,
> >> Strahil Nikolov
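Strahil's quoted recipe (gfid from 'heal info' -> inode on the brick -> real filename via find -> stat over FUSE) can be sketched as a script. This is only a sketch under assumptions taken from this thread: the brick path /gluster/md3/persistent and volume name 'persistent' come from the messages above, while the FUSE mount point /mnt/persistent is hypothetical.

```shell
#!/bin/bash
# Sketch of the gfid -> filename -> stat procedure from Strahil's mail.
BRICK=/gluster/md3/persistent
VOL=persistent
FUSE_MNT=/mnt/persistent   # hypothetical FUSE mount point

# A gfid abcdef... is hardlinked on the brick under .glusterfs/ab/cd/abcdef...
gfid_to_brick_path() {
    local brick=$1 gfid=$2
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "${gfid:0:2}" "${gfid:2:2}" "$gfid"
}

if command -v gluster >/dev/null; then
    # Pull the <gfid:...> entries out of 'heal info' (entries that are
    # already plain paths need no resolution at all).
    gluster volume heal "$VOL" info | grep -o '<gfid:[^>]*>' | tr -d '<>' | cut -d: -f2 |
    while read -r gfid; do
        p=$(gfid_to_brick_path "$BRICK" "$gfid")
        [ -e "$p" ] || continue
        inum=$(ls -li "$p" | awk '{print $1}')
        # Find the real filename for that inode (ionice'd, as suggested,
        # to reduce the I/O pressure), skipping the .glusterfs hardlink...
        ionice -c3 find "$BRICK" -inum "$inum" -not -path '*/.glusterfs/*' |
        while read -r f; do
            # ...and stat the corresponding path on the FUSE mount to
            # check whether it got healed.
            stat "$FUSE_MNT${f#"$BRICK"}" >/dev/null
        done
    done
fi
```

The gfid_to_brick_path helper just mirrors the two-level .glusterfs fan-out; the rest is the manual procedure from the mail, looped.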
Strahil Nikolov
2020-Jun-08 13:36 UTC
[Gluster-users] One error/warning message after upgrade 5.11 -> 6.8
Hm... That's something I didn't expect.

By the way, have you checked if all clients are connected to all bricks (if using FUSE)? Maybe you have some clients that cannot reach a brick.

Best Regards,
Strahil Nikolov

On 8 June 2020 12:48:22 GMT+03:00, Hu Bert <revirii at googlemail.com> wrote:
>Hi Strahil,
>
>thx for your answer, but i assume that your approach won't help. It
>seems that this behaviour is permanent; e.g. a log entry like this:
>
>[...]
>
>For me it looks like the 'error found -> heal it' process works as it
>should, but due to the permanent errors (log file entries) a heal
>count of zero is almost impossible to see.
>
>Well, one could deactivate features.ctime, as this seems to be the
>cause (as the log entries suggest), but i don't know if that is
>reasonable, i.e. if this feature is needed.
>
>Best regards,
>Hubert
>
>[...]
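Strahil's suggestion above (verify that every client is connected to every brick) can be scripted as a rough sketch. The volume name comes from this thread; the 'Clients connected : N' line format is an assumption about what 'gluster volume status VOL clients' prints on this version, so adapt the extraction if your output differs.

```shell
#!/bin/bash
# Rough check: compare the number of connected clients per brick.
# Differing counts hint at a client that cannot reach one of the bricks.
VOL=persistent

# Extract the per-brick client counts from the status output
# (assumed line format: "Clients connected : N").
client_counts() {
    grep -o 'Clients connected *: *[0-9]*' | grep -o '[0-9]*$'
}

if command -v gluster >/dev/null; then
    counts=$(gluster volume status "$VOL" clients | client_counts | sort -u)
    if [ "$(printf '%s\n' "$counts" | wc -l)" -gt 1 ]; then
        echo "brick client counts differ - some client may not reach every brick:"
        printf '%s\n' "$counts"
    fi
fi
```

A single unique count across all bricks is consistent with every client seeing every brick; for the actual culprit one would still check the client mount logs for disconnect messages.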