Nikhil Ladha
2021-Dec-27 13:03 UTC
[Gluster-users] Unable to upgrade nodes because of cksums mismatch
Hi Michael,

I think you are hitting an issue similar to this one:
https://github.com/gluster/glusterfs/issues/3066
If so, the fix is under review and could be available in the next release.

--
Thanks and Regards,
*NiKHIL LADHA*

On Mon, Dec 27, 2021 at 6:25 PM Michael Böhm <dudleyperkins at gmail.com> wrote:

> Hey guys,
>
> I have a problem upgrading our nodes from 8.3 to 10.0 - I just upgraded
> the first node and ran into the "cksums mismatch" problem. On the upgraded
> v10 node the checksums for all volumes are different from those on the
> other v8 nodes. That leads to the node starting in a peer rejected state.
> I can only resolve this by following the actions suggested here:
>
> https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Resolving%20Peer%20Rejected/
> (stop glusterd, delete /var/lib/glusterd/* (except glusterd.info),
> start glusterd, probe a v8 peer, restart glusterd again)
>
> The cluster seems healthy again, self-healing is started and everything
> looks fine - only the newly created cksums are still different from those
> on the other nodes. That means this healthy state only lasts until I
> reboot the node, at which point it all starts over: the node comes up as
> peer rejected.
>
> Now I've read about the problem here:
> https://github.com/gluster/glusterfs/issues/1332 (even though that
> says the problem should only occur when upgrading from earlier than v7)
> and also here on the mailing list:
> https://lists.gluster.org/pipermail/gluster-users/2021-November/039679.html
> (I think I have the same problem, but unfortunately no solution is given
> there)
>
> The solutions seem to require upgrading all nodes, and the problem should
> be resolved when finally upgrading op.version - but I don't think this
> approach can be done online, and there's not really a way for me to do
> this offline.
>
> Why is this happening now and not when I upgraded from pre-7 to 7? All my
> nodes are 8.3 and op.version is 8000.
>
> One thing I might have done "wrong": when I upgraded to v8 I didn't set
> "gluster volume set <volname> fips-mode-rchecksum on" on the volumes; I
> think I just overlooked it in the docs. I have this option set only on
> the 2 volumes I created after upgrading to v8. But even on those 2 the
> cksums differ, so I guess it wouldn't help a lot if I set the option on
> all the other volumes?
>
> I really don't know what to do now. I kind of understand the problem but
> don't know why this is happening on a cluster that is v8 throughout. I
> can't take all 9 nodes down, upgrade them all to v10 and rely on "it's
> all good" with the final upgrade of op.version.
>
> Can someone point me in a safe direction?
>
> Regards
>
> Mika
>
> ________
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
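The per-volume checksum comparison described in the message above could be sketched roughly as follows. This is a minimal sketch, not a Gluster tool. It assumes that each node keeps a volume's checksum in /var/lib/glusterd/vols/<volname>/cksum as a single "info=<number>" line, and that the file contents have already been collected from each node by hand. The function names parse_cksum and find_mismatches, and the node and volume names in the sample data, are hypothetical.

```python
import re
from collections import defaultdict

# Assumed file format: a single line such as "info=1234567890".
_CKSUM_RE = re.compile(r"^\s*info=(\d+)\s*$", re.MULTILINE)


def parse_cksum(text):
    """Return the integer checksum from the contents of a cksum file."""
    match = _CKSUM_RE.search(text)
    if match is None:
        raise ValueError("no 'info=<number>' line found")
    return int(match.group(1))


def find_mismatches(cksums_by_node):
    """Return {volume: {node: cksum}} for volumes whose checksums differ.

    cksums_by_node has the shape {node_name: {volume_name: checksum}}.
    """
    per_volume = defaultdict(dict)
    for node, volumes in cksums_by_node.items():
        for volume, cksum in volumes.items():
            per_volume[volume][node] = cksum
    return {
        volume: nodes
        for volume, nodes in per_volume.items()
        if len(set(nodes.values())) > 1
    }


if __name__ == "__main__":
    # Hypothetical sample data; node names, volume names and values are invented.
    sample = {
        "node1-v10": {
            "vol_a": parse_cksum("info=111\n"),
            "vol_b": parse_cksum("info=222\n"),
        },
        "node2-v8": {
            "vol_a": parse_cksum("info=999\n"),
            "vol_b": parse_cksum("info=222\n"),
        },
    }
    for volume, nodes in sorted(find_mismatches(sample).items()):
        print(volume, nodes)
```

Grouping by volume rather than by node makes it easy to see whether every volume differs on the upgraded node (as reported in this thread) or only some of them.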
Michael Böhm
2021-Dec-27 13:54 UTC
[Gluster-users] Unable to upgrade nodes because of cksums mismatch
On Mon, Dec 27, 2021 at 14:04, Nikhil Ladha <nladha at redhat.com> wrote:

> Hi Michael
>
> I think you are hitting a similar issue like this one
> https://github.com/gluster/glusterfs/issues/3066.
> If so, the fix for the same is under review and could be available in the
> next release.

Thanks for your reply. I saw that issue and thought it didn't really apply to me because I don't use striping (only Replicate and one Distributed-Replicate). But maybe, if it can cause the mismatched cksums, it could be the root cause.

I always hesitate to upgrade to an x.0 version, but since the holidays are the best time for these upgrades, I thought I might skip v9 this time.
Ingo Fischer
2021-Dec-27 14:57 UTC
[Gluster-users] Unable to upgrade nodes because of cksums mismatch
Hi,

In former releases like 9.x there was always a "major issues" section in the release notes, like https://docs.gluster.org/en/latest/release-notes/9.0/#major-issues (especially in ".0" releases).

Shouldn't this issue be added to such a section in the 10.0 release notes, to inform users who want to upgrade?

Ingo

On 27.12.21 at 14:03, Nikhil Ladha wrote:

> Hi Michael
>
> I think you are hitting a similar issue like this one
> https://github.com/gluster/glusterfs/issues/3066.
> If so, the fix for the same is under review and could be available in
> the next release.
>
> --
> Thanks and Regards,
> *NiKHIL LADHA*