lejeczek
2020-Jun-30 10:13 UTC
[Gluster-users] volume process does not start - glusterfs is happy with it?
Hi everybody.

I have two peers in the cluster and a 2-replica volume which seems okay
if it was not for one weird bit - when a peer reboots, then on that peer
after the reboot I see:

$ gluster volume status USERs
Status of volume: USERs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick swir.direct:/00.STORAGE/2/0-GLUSTER-U
SERs                                        N/A       N/A        N       N/A
Brick dzien.direct:/00.STORAGE/2/0-GLUSTER-
USERs                                       49152     0          Y       57338
Self-heal Daemon on localhost               N/A       N/A        Y       4302
Self-heal Daemon on dzien.direct            N/A       N/A        Y       57359

Task Status of Volume USERs
------------------------------------------------------------------------------
There are no active volume tasks

I do not suppose that is expected.
On the rebooted node I see:

$ systemctl status -l glusterd
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/glusterd.service.d
           └─override.conf
   Active: active (running) since Mon 2020-06-29 21:37:36 BST; 13h ago
     Docs: man:glusterd(8)
  Process: 4071 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status>
 Main PID: 4086 (glusterd)
    Tasks: 20 (limit: 101792)
   Memory: 28.9M
   CGroup: /system.slice/glusterd.service
           ├─4086 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
           └─4302 /usr/sbin/glusterfs -s localhost --volfile-id shd/USERs -p /var/run/gluster/shd/USERs/USERs-shd.pid -l /var/log/g>

Jun 29 21:37:36 swir.private.pawel systemd[1]: Starting GlusterFS, a clustered file-system server...
Jun 29 21:37:36 swir.private.pawel systemd[1]: Started GlusterFS, a clustered file-system server.

I do not see any other apparent problems or errors.
On that node I then manually run:

$ systemctl restart glusterd.service

and...

$ gluster volume status USERs
Status of volume: USERs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick swir.direct:/00.STORAGE/2/0-GLUSTER-U
SERs                                        49152     0          Y       103225
Brick dzien.direct:/00.STORAGE/2/0-GLUSTER-
USERs                                       49152     0          Y       57338
Self-heal Daemon on localhost               N/A       N/A        Y       103270
Self-heal Daemon on dzien.direct            N/A       N/A        Y       57359

Is that not a puzzle? I'm on glusterfs-7.6-1.el8.x86_64.
I hope somebody can share some thoughts.
many thanks, L.
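For reference, the brick log on the rebooted node should show why the brick
process failed to come up, and a forced volume start brings up only the
bricks that are down. A minimal sketch - the log file name is derived from
the brick path with slashes turned into dashes, so the exact name may
differ on your system:

$ less /var/log/glusterfs/bricks/00.STORAGE-2-0-GLUSTER-USERs.log
$ gluster volume start USERs force    # starts only the bricks that are not running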
Barak Sason Rofman
2020-Jun-30 10:31 UTC
[Gluster-users] volume process does not start - glusterfs is happy with it?
Greetings,

I'm not sure if that's directly related to your problem, but on a general
level, AFAIK, replica-2 volumes are not recommended due to the possibility
of split brain:
https://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/

It's recommended to use either replica-3 or an arbiter volume.

Regards,

On Tue, Jun 30, 2020 at 1:14 PM lejeczek <peljasz at yahoo.co.uk> wrote:
> Hi everybody.
>
> I have two peers in the cluster and a 2-replica volume which seems okay
> if it was not for one weird bit - when a peer reboots, then on that peer
> after the reboot I see:
> [...]

-- 
Barak Sason Rofman
Gluster Storage Development
Red Hat Israel <https://www.redhat.com/>
34 Jerusalem rd. Ra'anana, 43501
bsasonro at redhat.com
T: +972-9-7692304  M: +972-52-4326355
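For completeness, converting an existing replica-2 volume to arbiter only
requires adding a third (arbiter) brick. A minimal sketch, assuming a third
host named arbiter.direct and a brick path chosen to mirror the existing
ones (both hypothetical):

$ gluster volume add-brick USERs replica 3 arbiter 1 \
      arbiter.direct:/00.STORAGE/2/0-GLUSTER-USERs
$ gluster volume heal USERs full    # let self-heal populate the new arbiter brick

The arbiter brick stores only file names and metadata, so it needs far less
space than the data bricks while still preventing split brain.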
lejeczek
2020-Jul-01 14:46 UTC
[Gluster-users] volume process does not start - glusterfs is happy with it?
On 30/06/2020 11:31, Barak Sason Rofman wrote:
> I'm not sure if that's directly related to your problem, but on a general
> level, AFAIK, replica-2 volumes are not recommended due to the possibility
> of split brain:
> https://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
>
> It's recommended to use either replica-3 or an arbiter volume.
> [...]

That cannot be it! If the root cause of this problem were the 2-replica
volume, it would be a massive cock-up - 2-replica volumes should then be
banned and forbidden. I hope somebody can suggest a way to troubleshoot it.

ps. we all, I presume, know the problems of 2-replica volumes.

many thanks, L.
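A common cause of exactly this symptom (brick process dead after boot, fine
after a manual glusterd restart) is glusterd starting before the network is
fully up, so the brick cannot resolve or bind its address. Given that a
drop-in override already exists on the rebooting node, a minimal sketch of
an ordering fix, assuming that race is in fact the cause here:

# /etc/systemd/system/glusterd.service.d/override.conf
[Unit]
# Do not start glusterd until the network is actually online,
# not merely configured, so bricks can resolve and bind their addresses.
Wants=network-online.target
After=network-online.target

followed by `systemctl daemon-reload` and a reboot to verify the brick now
comes up on its own.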