Hello,

How do you keep track of the health status of your Gluster volumes: a brick going down (crash, failure, shutdown), node failure, peering issues, ongoing healing?

Gluster Tendrl is complex and sometimes broken, the Prometheus exporter is still lacking, and gstatus is basic.

Currently, to monitor a Gluster volume, you have to use a custom script to gather whatever information is needed, or a combination of the tools mentioned above.

Can Gluster have something similar to Ceph and display the health of the entire cluster? I know Ceph uses its "Monitors" to keep track of everything going on inside the cluster, but Gluster should also have a way to keep track of the cluster's health.

How is the community's experience with Gluster monitoring? How are you managing and tracking alerts and issues? Any recommendations?

Thank you.

--
Respectfully
Mahdi
https://github.com/gluster/gstatus

We run this from an Ansible-driven cron job and check for the healthy signal in the status, as well as looking for healing files that seem to persist. We have a number of Gluster clusters, and we have found its warnings both useful and timely.

-wk

On 10/26/2020 9:25 PM, Mahdi Adnan wrote:
> How do you keep track of the health status of your Gluster volumes?

________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
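The "healing files that seem to persist" check described by wk could be sketched roughly as below. This is only an illustration, not gstatus's own logic or output format: the function names, the idea of storing a per-entry streak between cron runs, and the threshold of three runs are all assumptions made for the example. How the list of pending heal entries is obtained on each run is left to whatever tool you already use.

```python
from typing import Dict, Iterable, Set


def update_heal_streaks(previous: Dict[str, int],
                        pending_now: Iterable[str]) -> Dict[str, int]:
    """Return new streak counts for the current run.

    Entries that are still pending get their count incremented;
    entries that have healed since the last run drop out.
    (Hypothetical helper, not part of gstatus.)
    """
    return {entry: previous.get(entry, 0) + 1 for entry in set(pending_now)}


def persistent_entries(streaks: Dict[str, int], threshold: int = 3) -> Set[str]:
    """Entries that have been pending for at least `threshold` consecutive runs."""
    return {entry for entry, runs in streaks.items() if runs >= threshold}
```

In a cron job, the streak dictionary would be loaded from and saved to a small state file between runs, and an alert raised only when `persistent_entries` is non-empty, so that short-lived heals after a brick restart do not page anyone.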
We have been using Zabbix for tracking Gluster, but that works because we are already using Zabbix for the rest of our monitoring, such as network and disk I/O.

One thing to track that is not part of the usual suspects is the heal counts. They should always be 0 unless you have a problem somewhere.

On 10/27/20 12:25 AM, Mahdi Adnan wrote:
> How do you keep track of the health status of your Gluster volumes?

--
Alvin Starr || land: (647)478-6285
Netvel Inc. || Cell: (416)806-0133
alvin at netvel.net ||
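As a rough sketch of the heal-count check mentioned above, the script below sums the per-brick counts from `gluster volume heal <VOLNAME> info`. It assumes the output contains one `Number of entries: N` line per brick; lines where the count is not a number (for example when a brick is unreachable) are simply skipped here and would need separate handling. The function names are made up for the example, and feeding the result into Zabbix or any other system is not shown.

```python
import re
import subprocess

# Matches per-brick lines such as "Number of entries: 2" (assumed output format).
ENTRY_RE = re.compile(r"^Number of entries:[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def total_heal_entries(heal_info_output: str) -> int:
    """Sum the numeric 'Number of entries' values across all bricks."""
    return sum(int(count) for count in ENTRY_RE.findall(heal_info_output))


def check_volume(volume: str) -> int:
    """Run 'gluster volume heal <volume> info' and return the total heal count."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    )
    return total_heal_entries(result.stdout)
```

A monitoring item can then alert whenever `check_volume("myvol")` returns anything other than 0, where `myvol` is a placeholder volume name.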
Hello,

We also looked into Tendrl some time ago, but in an enterprise environment (we are talking about 40+ Gluster clusters) it simply cannot be used, and it does indeed fail randomly, with no proper way to get it up again.

Apart from regular process monitoring, we also use a collectd plugin for Gluster (https://github.com/gluster/gluster-collectd), which works decently enough and allows Gluster metrics to be monitored alongside system metrics.

We don't yet use all the metrics exposed by that collectd plugin, but as a sample dashboard we currently use something like:

[Attached image (image/png, 207993 bytes): http://lists.gluster.org/pipermail/gluster-users/attachments/20201103/47c96650/attachment.png]

Regards,
Nico van Roijen

From: "Alvin Starr" <alvin at netvel.net>
To: "gluster-users" <gluster-users at gluster.org>
Sent: Tuesday, 27 October 2020 18:08:03
Subject: Re: [Gluster-users] Gluster monitoring

> We have been using zabbix for tracking gluster but that works because we are using zabbix for the rest of our monitoring of things like network and disk IO.