Ankireddypalle Reddy
2017-Jul-07 15:09 UTC
[Gluster-users] gfid and volume-id extended attributes lost
Hi,

We hit an issue in production today. We had to stop the volume and reboot all the servers in the cluster. After the servers rebooted, starting the volume failed because the following extended attributes were missing on all the bricks on 2 of the servers:

1) trusted.gfid
2) trusted.glusterfs.volume-id

We had to set these extended attributes manually before the volume would start. Are there any known issues that cause this?

Thanks and Regards,
Ram
Pranith Kumar Karampuri
2017-Jul-07 15:45 UTC
[Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost
Did anything special happen on these two bricks? It can't happen in the I/O path: posix_removexattr() has:

        if (!strcmp (GFID_XATTR_KEY, name)) {
                gf_msg (this->name, GF_LOG_WARNING, 0, P_MSG_XATTR_NOT_REMOVED,
                        "Remove xattr called on gfid for file %s", real_path);
                op_ret = -1;
                goto out;
        }
        if (!strcmp (GF_XATTR_VOL_ID_KEY, name)) {
                gf_msg (this->name, GF_LOG_WARNING, 0, P_MSG_XATTR_NOT_REMOVED,
                        "Remove xattr called on volume-id for file %s",
                        real_path);
                op_ret = -1;
                goto out;
        }

I just found that op_errno is not set correctly here, but either way the removal can't happen in the I/O path, so self-heal/rebalance are off the hook.

I also grepped for any removexattr of trusted.gfid from glusterd and didn't find any.

One thing that used to happen is that sometimes, when machines reboot, the brick mounts wouldn't happen. The brick path is then just the bare mount-point directory on the underlying filesystem, which carries neither trusted.gfid nor volume-id. So at the moment this is my wild guess.

Pranith