Karthik Subrahmanya
2018-Mar-14 12:12 UTC
[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."
On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org> wrote:

> Hi Karthik,
>
> Thanks a lot for the explanation.
>
> Does it mean a distributed volume's health can be checked only by the
> "gluster volume status" command?

Yes. I am not aware of any other command which can give the status of a
plain distribute volume in the way the heal info command does for
replicate/disperse volumes.

> And one more question: cluster.min-free-disk is 10% by default. What kind
> of "side effects" can we face if this option is reduced to, for example,
> 5%? Could you point to any best practice document(s)?

Yes, you can decrease it to any value. There won't be any side effect.

Regards,
Karthik

> Regards,
> Anatoliy
>
> On 2018-03-13 16:46, Karthik Subrahmanya wrote:
>
> Hi Anatoliy,
>
> The heal command is basically used to heal any mismatching contents
> between replica copies of files.
> For the command "gluster volume heal <volname>" to succeed, you should
> have the self-heal daemon running, which is the case only if your volume
> is of type replicate/disperse.
> In your case you have a plain distribute volume, where no replica of any
> file is stored, so volume heal returns that error.
>
> Regards,
> Karthik
>
> On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org>
> wrote:
>
>> Hi,
>>
>> Maybe someone can point me to documentation or explain this? I can't
>> find it myself.
>> Do we have any other useful resources besides doc.gluster.org? As I see
>> it, many gluster options are not described there, or there is no
>> explanation of what they do...
>>
>> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
>>
>>> Hello,
>>>
>>> We have a very fresh gluster 3.10.10 installation.
>>> Our volume is created as a distributed volume, 9 bricks, 96TB in total
>>> (87TB after 10% of gluster disk space reservation).
>>>
>>> For some reason I can't "heal" the volume:
>>> # gluster volume heal gv0
>>> Launching heal operation to perform index self heal on volume gv0 has
>>> been unsuccessful on bricks that are down. Please check if all brick
>>> processes are running.
>>>
>>> Which processes should be run on every brick for the heal operation?
>>>
>>> # gluster volume status
>>> Status of volume: gv0
>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick cn01-ib:/gfs/gv0/brick1/brick         0         49152      Y       70850
>>> Brick cn02-ib:/gfs/gv0/brick1/brick         0         49152      Y       102951
>>> Brick cn03-ib:/gfs/gv0/brick1/brick         0         49152      Y       57535
>>> Brick cn04-ib:/gfs/gv0/brick1/brick         0         49152      Y       56676
>>> Brick cn05-ib:/gfs/gv0/brick1/brick         0         49152      Y       56880
>>> Brick cn06-ib:/gfs/gv0/brick1/brick         0         49152      Y       56889
>>> Brick cn07-ib:/gfs/gv0/brick1/brick         0         49152      Y       56902
>>> Brick cn08-ib:/gfs/gv0/brick1/brick         0         49152      Y       94920
>>> Brick cn09-ib:/gfs/gv0/brick1/brick         0         49152      Y       56542
>>>
>>> Task Status of Volume gv0
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> # gluster volume info gv0
>>> Volume Name: gv0
>>> Type: Distribute
>>> Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 9
>>> Transport-type: rdma
>>> Bricks:
>>> Brick1: cn01-ib:/gfs/gv0/brick1/brick
>>> Brick2: cn02-ib:/gfs/gv0/brick1/brick
>>> Brick3: cn03-ib:/gfs/gv0/brick1/brick
>>> Brick4: cn04-ib:/gfs/gv0/brick1/brick
>>> Brick5: cn05-ib:/gfs/gv0/brick1/brick
>>> Brick6: cn06-ib:/gfs/gv0/brick1/brick
>>> Brick7: cn07-ib:/gfs/gv0/brick1/brick
>>> Brick8: cn08-ib:/gfs/gv0/brick1/brick
>>> Brick9: cn09-ib:/gfs/gv0/brick1/brick
>>> Options Reconfigured:
>>> client.event-threads: 8
>>> performance.parallel-readdir: on
>>> performance.readdir-ahead: on
>>> cluster.nufa: on
>>> nfs.disable: on
>>
>> --
>> Best regards,
>> Anatoliy
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
> --
> Best regards,
> Anatoliy
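For anyone landing on this thread: since a plain distribute volume has no self-heal daemon, brick health has to be read from the status output instead. A minimal check, as a sketch only (the volume name gv0 comes from this thread; the "detail" sub-command exists on the GlusterFS releases I am familiar with, but verify it against your version):

# gluster volume status gv0
# gluster volume status gv0 detail

The first command should report Online = Y and a valid Pid for every brick; "detail" additionally shows per-brick free space and inode counts, which is useful context for the cluster.min-free-disk discussion that follows.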
Karthik Subrahmanya
2018-Mar-14 12:50 UTC
[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."
On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:

> On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org>
> wrote:
>
>> Hi Karthik,
>>
>> Thanks a lot for the explanation.
>>
>> Does it mean a distributed volume's health can be checked only by the
>> "gluster volume status" command?
>
> Yes. I am not aware of any other command which can give the status of a
> plain distribute volume in the way the heal info command does for
> replicate/disperse volumes.
>
>> And one more question: cluster.min-free-disk is 10% by default. What kind
>> of "side effects" can we face if this option is reduced to, for example,
>> 5%? Could you point to any best practice document(s)?
>
> Yes, you can decrease it to any value. There won't be any side effect.

Small correction here: min-free-disk should ideally be set larger than the
largest file size likely to be written. Decreasing it beyond a point raises
the likelihood of the brick getting full, which is a very bad state to be in.
I will update you if I find a document which explains this. Sorry for the
previous statement.

> Regards,
> Karthik
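To make the correction above concrete, here is a hedged sketch of adjusting the option (gv0 and the 1TB figure are illustrative only; cluster.min-free-disk takes a percentage and, on the releases I am aware of, also an absolute size, and "volume get" is available on recent releases including 3.10):

# gluster volume set gv0 cluster.min-free-disk 1TB
# gluster volume get gv0 cluster.min-free-disk

Whatever value is used, Karthik's rule of thumb (larger than the largest file likely to be written) is the safer lower bound.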
Anatoliy Dmytriyev
2018-Mar-14 13:22 UTC
[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."
Thanks

On 2018-03-14 13:50, Karthik Subrahmanya wrote:

> Small correction here: min-free-disk should ideally be set larger than the
> largest file size likely to be written. Decreasing it beyond a point raises
> the likelihood of the brick getting full, which is a very bad state to be in.
> I will update you if I find a document which explains this. Sorry for the
> previous statement.

--
Best regards,
Anatoliy