Hoggins!
2018-Oct-07 16:14 UTC
[Gluster-users] "Solving" a recurrent "performing entry selfheal on [...]" on my bricks
Hello list,

My Gluster cluster has a condition, and I'd like to know how to cure it.

The setup: two bricks, replicated, with an arbiter.
On brick 1, /var/log/glusterfs/glustershd.log is nearly empty: not much activity, everything looks fine.
On brick 2, /var/log/glusterfs/glustershd.log shows a lot of these:
    [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-mailer-replicate-0: performing entry selfheal on 9df5082b-d066-4659-91a4-5f2ad943ce51
    [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-mailer-replicate-0: performing entry selfheal on ba8c0409-95f5-499d-8594-c6de15d5a585

These entries are repeated every day, every ten minutes or so.

Now if we list the contents of the directory represented by file ID 9df5082b-d066-4659-91a4-5f2ad943ce51:
On brick 1:
    drwx------. 2 1005 users 102400 13 sept. 17:03 cur
    -rw-------. 2 1005 users     22 14 mars   2016 dovecot-keywords
    -rw-------. 2 1005 users      0  6 janv.  2015 maildirfolder
    drwx------. 2 1005 users      6 30 juin   2015 new
    drwx------. 2 1005 users      6  4 oct.  17:46 tmp

On brick 2:
    drwx------. 2 1005 users 102400 25 mai   11:00 cur
    -rw-------. 2 1005 users     22 14 mars   2016 dovecot-keywords
    -rw-------. 2 1005 users  80559 25 mai   11:00 dovecot-uidlist
    -rw-------. 2 1005 users      0  6 janv.  2015 maildirfolder
    drwx------. 2 1005 users      6 30 juin   2015 new
    drwx------. 2 1005 users      6  4 oct.  17:46 tmp

(note the "dovecot-uidlist" file present on brick 2 but not on brick 1)

Also, checking directory sizes for the cur/ directory:
On brick 1:
    165872    cur/

On brick 2:
    161516    cur/

BUT the number of files is the same on the two bricks for the cur/ directory:
    $~ ls -l cur/ | wc -l
    1135

So now you've got it: it's inconsistent between the two data bricks.

On the arbiter, all seems good: the directory listing looks like what is on brick 2.
The same kind of situation occurs for file ID ba8c0409-95f5-499d-8594-c6de15d5a585.

I'm sure this situation is not good and needs to be sorted out, so what can I do?

Thanks for your help!

    Hoggins!
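
For anyone following along: the file IDs (GFIDs) in the self-heal log can be mapped to a path on each brick, because GlusterFS keeps an entry for every GFID under the brick's .glusterfs directory, named after the first two and the next two hex characters of the GFID. The Python sketch below is only an illustration of that lookup plus a diff of the two listings quoted above. The brick root /data/brick is a hypothetical path (the post does not give one), and the helper names are made up for this example.

```python
import posixpath


def gfid_to_backend_path(brick_root, gfid):
    """Build the .glusterfs path of a GFID under a brick root.

    Layout: <brick_root>/.glusterfs/<hex 0-1>/<hex 2-3>/<gfid>
    """
    gfid = gfid.lower()
    return posixpath.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def diff_listings(entries_a, entries_b):
    """Return (only_in_a, only_in_b) as two sorted lists of names."""
    a, b = set(entries_a), set(entries_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical brick root; the GFID is the one from the log above.
print(gfid_to_backend_path("/data/brick", "9df5082b-d066-4659-91a4-5f2ad943ce51"))
# → /data/brick/.glusterfs/9d/f5/9df5082b-d066-4659-91a4-5f2ad943ce51

# Directory entries as quoted in the post for each data brick.
brick1 = ["cur", "dovecot-keywords", "maildirfolder", "new", "tmp"]
brick2 = ["cur", "dovecot-keywords", "dovecot-uidlist", "maildirfolder", "new", "tmp"]

only_1, only_2 = diff_listings(brick1, brick2)
print(only_1, only_2)
# → [] ['dovecot-uidlist']
```

Listing that path on each brick server gives the entries to feed into diff_listings; here the only difference is the dovecot-uidlist file that brick 1 lacks.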
Vlad Kopylov
2018-Oct-10 05:05 UTC
[Gluster-users] "Solving" a recurrent "performing entry selfheal on [...]" on my bricks
Isn't it trying to heal your dovecot-uidlist? Try updating, restarting, and initiating the heal again.

-v
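
As a rough sketch of the "initiating heal again" step: the gluster CLI has `gluster volume heal <VOLNAME>` to trigger a heal and `gluster volume heal <VOLNAME> info` to list entries still pending. The Python wrapper below only assembles and runs those two commands. The volume name "mailer" is an assumption inferred from the "0-mailer-replicate-0" prefix in the log lines, and the function names are hypothetical.

```python
import subprocess
from typing import Callable, List


def heal_commands(volume: str) -> List[List[str]]:
    """Return the CLI invocations to trigger a heal and then inspect it."""
    return [
        ["gluster", "volume", "heal", volume],          # trigger a heal
        ["gluster", "volume", "heal", volume, "info"],  # list entries pending heal
    ]


def run_heal(volume: str, runner: Callable = subprocess.run) -> None:
    """Run each command in order and stop at the first failure.

    `runner` is injectable so the sequence can be dry-run without gluster.
    """
    for cmd in heal_commands(volume):
        runner(cmd, check=True)


# Dry run: print the commands instead of executing them.
# "mailer" is an assumed volume name, see the note above.
for cmd in heal_commands("mailer"):
    print(" ".join(cmd))
```

Calling run_heal("mailer") on one of the servers would execute both commands for real; if the same GFIDs keep showing up in the info output afterwards, the heal is still not completing.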