thr3ads.net - Gluster users - [Gluster-users] how to check/fix underlaying partition error? [Apr 2015]

Pavel Riha

2015-Apr-15 15:05 UTC

[Gluster-users] how to check/fix underlaying partition error?

Thank you for your reply.

But by the way, what is the right way to do this?
Stopping the glusterd service does not stop the glusterfsd daemons themselves:
https://bugzilla.redhat.com/show_bug.cgi?id=988946

I also have several volumes running, but only one has this problem.
I haven't found any official way to stop the process, so I just
KILLed them.
It worked: the partition is repaired and seems OK for now.

But how do I start the brick again?
I didn't save the command line shown in ps, but it was crazy. Looking at the
other running bricks, there are crazy numbers (UUID, socket, port),
and the port, for example, is not the same as on the other server...

So I restarted the glusterd service... nothing happened... I was hopeless...
but after a while I noticed that the process was running, so maybe
glusterd started it after a while.

There should be some way to stop, or at least start, a single brick.
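
Rather than decoding the glusterfsd command line from ps, the brick's PID and port can be read from `gluster volume status <VOLNAME>`, which prints one row per brick. The following is only a minimal sketch: the exact column layout differs between GlusterFS releases, so the parser just assumes that the last two columns of a "Brick" row are the Online flag and the PID. The sample text, host names, paths, ports and PIDs are all made up for illustration.

```python
from typing import Dict, NamedTuple


class BrickStatus(NamedTuple):
    online: bool
    pid: str  # kept as text, because an offline brick has no numeric PID


def parse_brick_status(status_output: str) -> Dict[str, BrickStatus]:
    """Map 'host:/brick/path' to the Online flag and PID column of its row."""
    bricks = {}
    for line in status_output.splitlines():
        fields = line.split()
        # Assumed row shape: Brick <host:/path> <port column(s)> <Y|N> <pid>
        if len(fields) < 4 or fields[0] != "Brick":
            continue
        bricks[fields[1]] = BrickStatus(online=(fields[-2] == "Y"),
                                        pid=fields[-1])
    return bricks


# Illustrative sample only: hypothetical volume, hosts, paths, ports and PIDs.
SAMPLE = """\
Status of volume: vmstore
Gluster process                         Port    Online  Pid
------------------------------------------------------------
Brick server1:/export/brick1            49152   Y       2301
Brick server2:/export/brick1            49153   N       N/A
Self-heal Daemon on localhost           N/A     Y       2315
"""

if __name__ == "__main__":
    for name, status in parse_brick_status(SAMPLE).items():
        state = "online" if status.online else "OFFLINE"
        print(name, state, status.pid)
```

In real use the text would come from running the status command on one of the servers; long brick paths may wrap onto a second line in some releases, which this sketch does not handle.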

Pavel

On 15.4.2015 11:59, Sander Zijlstra wrote:
> Hi Pavel,
>
> you can simply stop the glusterd service and run the fsck; it's similar
> to rebooting a server which is part of a replicated volume. If all is OK
> beforehand, you can simply take down one of the two, and once it comes
> back online it will heal each file which hasn't been copied already.
>
> Do take care of any client which has the volume mounted using the server
> you take down; it will lose its connection as well.
>
> Met vriendelijke groet / kind regards,
>
> Sander Zijlstra
>
> Linux Engineer | SURFsara | Science Park 140 | 1098XG Amsterdam |
> +31 (0)6 43 99 12 47 | sander.zijlstra at surfsara.nl | www.surfsara.nl |
>
> ----- Original Message -----
> From: "Pavel Riha" <pavel.riha at trilogic.cz>
> To: gluster-users at gluster.org
> Sent: Wednesday, 15 April, 2015 10:28:50
> Subject: [Gluster-users] how to check/fix underlaying partition error?
>
> Hi guys,
>
> I have a replicated glusterfs (v3.4.2) on two servers, and I found the logs
> filled with I/O errors on one server only. But /var/log/messages shows no
> hardware error, only XFS errors, so I guess the filesystem could be corrupted.
>
> My question is: how do I stop or pause this brick and run fsck?
> Given the replicate feature, I expect there is no need to stop the gluster
> volume (there are some Xen VMs running).
>
> What is the right way to do it, with the later re-adding and a fast
> rebuild/sync in mind?
>
> Thanks for any tips.
>
> Pavel
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>

Jiri Hoogeveen

2015-Apr-16 06:49 UTC

[Gluster-users] how to check/fix underlaying partition error?

Hi Pavel,

Killing the brick process is the way to go.
This way, all other bricks on that server will keep working.
After you replace/fix the disk, a restart of the glusterd process should be
enough to get the brick back online. (The self-healing scan can take some I/O.)
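
The sequence described above can be written down as an ordered command list. This is a sketch under stated assumptions, not a tested procedure: the volume name, PID, device and mount point are placeholders, `xfs_repair` stands in for "fsck" because the brick is on XFS, and `gluster volume start <VOLNAME> force` is used as an alternative to the glusterd restart mentioned in this thread. The function only builds the commands; it does not run anything.

```python
from typing import List


def brick_repair_plan(volume: str, brick_pid: int,
                      device: str, mountpoint: str) -> List[List[str]]:
    """Return the commands, in order, for repairing one brick's filesystem."""
    return [
        ["kill", str(brick_pid)],        # stop only this brick's glusterfsd
        ["umount", mountpoint],          # xfs_repair needs an unmounted filesystem
        ["xfs_repair", "-n", device],    # dry run: report problems, change nothing
        ["xfs_repair", device],          # the actual repair
        ["mount", device, mountpoint],
        # Ask glusterd to start any brick of the volume that is not running.
        ["gluster", "volume", "start", volume, "force"],
        # List the entries that still need to be healed from the other replica.
        ["gluster", "volume", "heal", volume, "info"],
    ]


if __name__ == "__main__":
    # All four arguments are placeholders; substitute your own values.
    for cmd in brick_repair_plan("vmstore", 2301, "/dev/sdb1", "/export/brick1"):
        print(" ".join(cmd))
```

Printing the plan first and running each step by hand keeps the other bricks on the server untouched and leaves room to stop if the dry run reports something unexpected.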

Do you have any logs about the brick that would not start?

By the way, I/O errors on XFS? Did you lose some files from brick/.glusterfs?
That could explain why the brick will not start up.

Grtz, Jiri
