thr3ads.net - Gluster users - [Gluster-users] Subject: Help needed in improving monitoring in Gluster [Jul 2018]


Maarten van Baarsel

2018-Jul-23 14:54 UTC

[Gluster-users] Subject: Help needed in improving monitoring in Gluster

On 23/07/18 16:03, Pranith Kumar Karampuri wrote:
> We want gluster's monitoring/observability to be as easy as possible 
> going forward. As part of reaching this goal we are starting this 
> initiative to add improvements to existing apis/commands and create new 
> apis/commands to gluster so that the admin can integrate it with 
> whichever monitoring tool he/she likes. The gluster-prometheus project 
> hosted at https://github.com/gluster/gluster-prometheus is the direction 
> in which we feel metrics can be collected and distributed from a Gluster 
> cluster enabling analysis and visualization.
> 
> 
> As a first step we want to hear from you what you feel needs to be 
> addressed.
Regarding monitoring; I would love to see in my monitoring that 
geo-replication is working as intended; at the moment I'm faking georep 
monitoring by having a process touch a file (every server involved in 
gluster touches another file) on every volume and checking mtime on the 
slave.
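
A minimal sketch of the heartbeat check described above, assuming the
touched file is visible at some path on the slave. The function names, the
path argument and the 300-second threshold are illustrative assumptions,
not anything Gluster provides:

```python
import os
import time


def heartbeat_age_seconds(path, now=None):
    """Return how many seconds ago the file at `path` was last modified."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime


def georep_looks_alive(path, max_age_seconds=300, now=None):
    """True if the heartbeat file exists and its mtime is recent enough.

    `max_age_seconds` is an arbitrary example threshold; pick one that is
    comfortably larger than the interval at which the master touches the file.
    """
    try:
        return heartbeat_age_seconds(path, now) <= max_age_seconds
    except FileNotFoundError:
        # A missing heartbeat file is treated as a failure, not as "unknown".
        return False
```

A fresh mtime only shows that this one file arrived recently; it says
nothing about whether the rest of the volume has been replicated.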

However, I discovered that this is not foolproof: if the georep run
stops for whatever reason, the mtime of the monitored file still keeps
being updated (probably because the file is updated too often), even
though the georep is not complete.

I've also seen that a crashed glusterd escapes this monitoring.

What would also be fun is some kind of monitoring where you can find out 
why gluster is running at X MB/sec where Y MB/sec is expected (bit large 
target, that)

I once tried monitoring 'gluster volume status all' output, but that
only works if everything is OK; with some network problems you can wait
for hours for output, which then causes more problems.
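
One way a monitoring check can avoid waiting for hours is to put a hard
timeout around the command. A minimal sketch, assuming the check is a
Python script; the helper name and the 30-second default are illustrative,
and killing a hung CLI call may not by itself avoid the follow-on problems
mentioned above:

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds=30):
    """Run `cmd` and return (ok, output).

    ok is False if the command times out, cannot be started,
    or exits with a non-zero status.
    """
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False, "timed out after %s seconds" % timeout_seconds
    except OSError as exc:
        # For example, the binary is not installed on this host.
        return False, "could not run command: %s" % exc
    return result.returncode == 0, result.stdout
```

It could be called as run_with_timeout(["gluster", "volume", "status",
"all"]), with a timeout reported to the monitoring system as a failure in
its own right.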

Also, I've checked the example output at 
https://github.com/gluster/gluster-prometheus:

would JSON or something like that be more friendly to parse than
the "[parameter] { [details] } [number]" format?
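
For what it's worth, a line in that "name{labels} value" shape can be
turned into JSON with a few lines of code. This is a simplified sketch,
not a full parser for the Prometheus text format: it ignores comment
lines, escaped quotes inside label values and trailing timestamps, and
the metric name used in the usage note is made up for illustration:

```python
import json
import re

# name, optional {labels}, whitespace, value
_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)$'
)
# key="value" pairs; escaped quotes are NOT handled in this sketch
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def metric_line_to_dict(line):
    """Parse one 'name{k="v",...} value' line into a plain dict."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised metric line: %r" % line)
    labels = dict(_LABEL.findall(match.group("labels") or ""))
    return {
        "name": match.group("name"),
        "labels": labels,
        "value": float(match.group("value")),
    }


def metric_line_to_json(line):
    """Same as metric_line_to_dict, serialised as a JSON string."""
    return json.dumps(metric_line_to_dict(line), sort_keys=True)
```

For example, metric_line_to_dict('example_metric{volume="gv0"} 42') gives
a dict with the name, a labels mapping and the value as a float.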

thanks,
Maarten.

Sankarshan Mukhopadhyay

2018-Jul-23 16:25 UTC


[Gluster-users] Subject: Help needed in improving monitoring in Gluster

On Mon, Jul 23, 2018 at 8:24 PM, Maarten van Baarsel
<mrten_glusterusers at ii.nl> wrote:
> On 23/07/18 16:03, Pranith Kumar Karampuri wrote:
>
>> We want gluster's monitoring/observability to be as easy as possible
>> going forward. As part of reaching this goal we are starting this
>> initiative to add improvements to existing apis/commands and create new
>> apis/commands to gluster so that the admin can integrate it with
>> whichever monitoring tool he/she likes. The gluster-prometheus project
>> hosted at https://github.com/gluster/gluster-prometheus is the direction
>> in which we feel metrics can be collected and distributed from a Gluster
>> cluster enabling analysis and visualization.
>>
>>
>> As a first step we want to hear from you what you feel needs to be
>> addressed.
>
>
> Regarding monitoring; I would love to see in my monitoring that
> geo-replication is working as intended; at the moment I'm faking georep
> monitoring by having a process touch a file (every server involved in
> gluster touches another file) on every volume and checking mtime on the
> slave.
>
I'd like to request that if possible, you elaborate on how you'd like
to see the "as intended" situation. What kind of data points and/or
visualization would aid you in arriving at that conclusion?
