thr3ads.net - Gluster users - [Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost [Jul 2017]

Pranith Kumar Karampuri

2017-Jul-07 15:45 UTC

[Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

Did anything special happen on these two bricks? It can't happen in the I/O
path:
posix_removexattr() has:

        if (!strcmp (GFID_XATTR_KEY, name)) {
                gf_msg (this->name, GF_LOG_WARNING, 0,
                        P_MSG_XATTR_NOT_REMOVED,
                        "Remove xattr called on gfid for file %s",
                        real_path);
                op_ret = -1;
                goto out;
        }
        if (!strcmp (GF_XATTR_VOL_ID_KEY, name)) {
                gf_msg (this->name, GF_LOG_WARNING, 0,
                        P_MSG_XATTR_NOT_REMOVED,
                        "Remove xattr called on volume-id for file %s",
                        real_path);
                op_ret = -1;
                goto out;
        }

I just noticed that op_errno is not set correctly here, but the removal
itself still can't happen in the I/O path, so self-heal/rebalance are off
the hook.
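
For illustration, here is a minimal standalone sketch of the same kind of guard that also reports an error code to the caller. It is not the upstream code or the upstream fix: the function name and the SKETCH_* macros are hypothetical, the two key strings are assumed to be what GFID_XATTR_KEY and GF_XATTR_VOL_ID_KEY expand to, and EPERM is only an assumed choice of error value.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Assumed expansions of GFID_XATTR_KEY and GF_XATTR_VOL_ID_KEY. */
#define SKETCH_GFID_KEY   "trusted.gfid"
#define SKETCH_VOL_ID_KEY "trusted.glusterfs.volume-id"

/*
 * Hypothetical stand-in for the guard in posix_removexattr().
 * Returns 0 if removing `name` may proceed. Returns -1 and sets
 * *op_errno when the key is protected. EPERM is an assumption made
 * for this sketch, not necessarily what upstream uses.
 */
static int
sketch_check_removexattr (const char *name, int *op_errno)
{
        assert (name != NULL && op_errno != NULL);

        if (!strcmp (SKETCH_GFID_KEY, name) ||
            !strcmp (SKETCH_VOL_ID_KEY, name)) {
                *op_errno = EPERM;
                return -1;
        }

        *op_errno = 0;
        return 0;
}
```

The point of the sketch is only that the failing branches set both the return value and the error code, so a caller never sees a failure paired with a stale errno.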

I also grepped for any removexattr of trusted.gfid from glusterd and didn't
find any.

One thing that used to happen was that sometimes, when machines rebooted,
the brick filesystems would not get mounted, and the bare mount-point
directories then had neither trusted.gfid nor trusted.glusterfs.volume-id.
At the moment this is only my wild guess.
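
A rough way to check for the unmounted-brick case is sketched below, assuming Linux and root privileges. Both function names are hypothetical. The first tests whether a directory is a mount point by comparing device numbers with its parent; the second tests whether a given xattr exists on a path. If the brick is a subdirectory of the mounted filesystem, the mount-point check applies to the mount directory rather than the brick directory itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Returns 1 if `dir` is a mount point, 0 if it is not, -1 on error.
 * A directory counts as a mount point when it is on a different device
 * than its parent, or when it is its own parent (the root directory).
 * Bind mounts within one filesystem are not detected by this test.
 */
static int
sketch_is_mount_point (const char *dir)
{
        char        parent[4096];
        struct stat st, pst;
        int         len;

        assert (dir != NULL);

        len = snprintf (parent, sizeof (parent), "%s/..", dir);
        if (len < 0 || (size_t) len >= sizeof (parent))
                return -1;
        if (stat (dir, &st) != 0 || !S_ISDIR (st.st_mode))
                return -1;
        if (stat (parent, &pst) != 0)
                return -1;

        return (st.st_dev != pst.st_dev || st.st_ino == pst.st_ino) ? 1 : 0;
}

/*
 * Returns 1 if xattr `name` exists on `path`, 0 if it is absent,
 * -1 on any other error. Reading trusted.* attributes needs root;
 * without it Linux reports them as absent.
 */
static int
sketch_has_xattr (const char *path, const char *name)
{
        assert (path != NULL && name != NULL);

        if (getxattr (path, name, NULL, 0) >= 0)
                return 1;
        return (errno == ENODATA) ? 0 : -1;
}
```

If sketch_is_mount_point() returns 0 for the directory where the brick filesystem should be mounted, missing trusted.gfid and trusted.glusterfs.volume-id would be expected, and mounting the filesystem is the first thing to try.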


On Fri, Jul 7, 2017 at 8:39 PM, Ankireddypalle Reddy <areddy at commvault.com> wrote:
> Hi,
>
> We faced an issue in production today. We had to stop the volume and
> reboot all the servers in the cluster. Once the servers had rebooted,
> starting the volume failed because the following extended attributes
> were missing on all the bricks on 2 servers.
>
> 1)      trusted.gfid
>
> 2)      trusted.glusterfs.volume-id
>
>
>
> We had to set these extended attributes manually to start the volume. Are
> there any such known issues?
>
>
>
> Thanks and Regards,
>
> Ram
> ***************************Legal Disclaimer***************************
> "This communication may contain confidential and privileged material for
> the sole use of the intended recipient. Any unauthorized review, use or
> distribution by others is strictly prohibited. If you have received the
> message by mistake, please advise the sender by reply email and delete
> the message. Thank you."
> **********************************************************************
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>


-- 
Pranith

Pranith Kumar Karampuri

2017-Jul-07 15:47 UTC

[Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

On Fri, Jul 7, 2017 at 9:15 PM, Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:
> So one thing that used to happen was that sometimes when machines reboot,
> the brick mounts wouldn't happen and this would lead to absence of both
> trusted.gfid and volume-id. So at the moment this is my wild guess.
>
The fix for this was to mount the bricks. But since you set the xattrs
instead, I am guessing the other data was intact and only these particular
xattrs were missing? I wonder what new problem this is.



-- 
Pranith

Ankireddypalle Reddy

2017-Jul-07 15:50 UTC

[Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

Pranith,

Thanks for looking into the issue. The bricks were mounted after the reboot.
One more thing I noticed: when the attributes were set manually while
glusterd was up, they were lost again on starting the volume. We had to stop
glusterd, set the attributes, and then start glusterd. After that the volume
start succeeded.
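
The manual step described above can be sketched as follows, assuming Linux, root privileges, and glusterd stopped. The function names are hypothetical. The sketch assumes that both attributes hold a UUID as 16 raw bytes, that the volume UUID is the "Volume ID" shown by `gluster volume info`, and that the brick root carries the gfid 00000000-0000-0000-0000-000000000001.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Convert a textual UUID such as
 * "00000000-0000-0000-0000-000000000001" into 16 raw bytes.
 * Dashes are skipped wherever they appear. Returns 0 on success,
 * -1 if the input does not contain exactly 32 hex digits.
 */
static int
sketch_parse_uuid (const char *text, unsigned char out[16])
{
        size_t      nibbles = 0;
        const char *p;

        assert (text != NULL && out != NULL);

        for (p = text; *p != '\0'; p++) {
                unsigned char c = (unsigned char) *p;
                int           v;

                if (c == '-')
                        continue;
                if (!isxdigit (c) || nibbles >= 32)
                        return -1;
                v = isdigit (c) ? c - '0' : tolower (c) - 'a' + 10;
                if (nibbles % 2 == 0)
                        out[nibbles / 2] = (unsigned char) (v << 4);
                else
                        out[nibbles / 2] |= (unsigned char) v;
                nibbles++;
        }
        return (nibbles == 32) ? 0 : -1;
}

/*
 * Store `uuid_text` as a 16-byte xattr called `name` on `brick`.
 * Returns 0 on success, -1 on a malformed UUID or when setxattr()
 * fails (errno is set only in the latter case).
 */
static int
sketch_set_uuid_xattr (const char *brick, const char *name,
                       const char *uuid_text)
{
        unsigned char raw[16];

        assert (brick != NULL && name != NULL);

        if (sketch_parse_uuid (uuid_text, raw) != 0)
                return -1;
        return setxattr (brick, name, raw, sizeof (raw), 0);
}
```

A caller would invoke sketch_set_uuid_xattr() once per brick root for trusted.glusterfs.volume-id with the volume's UUID, and once for trusted.gfid with the root gfid. This only restores the two attributes; it does not explain why they went missing.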

Thanks and Regards,
Ram


Pranith Kumar Karampuri

2017-Jul-07 15:53 UTC

[Gluster-users] [Gluster-devel] gfid and volume-id extended attributes lost

On Fri, Jul 7, 2017 at 9:20 PM, Ankireddypalle Reddy <areddy at commvault.com> wrote:
> Pranith,
>
> Thanks for looking into the issue. The bricks were mounted after the
> reboot. One more thing I noticed: when the attributes were set manually
> while glusterd was up, they were lost again on starting the volume. We
> had to stop glusterd, set the attributes, and then start glusterd. After
> that the volume start succeeded.
>
Which version is this?


-- 
Pranith
