Raghavendra Bhat
2014-Jun-04 08:54 UTC
[Gluster-users] [Gluster-devel] autodelete in snapshots
On Wednesday 04 June 2014 11:23 AM, Rajesh Joseph wrote:
>
> ----- Original Message -----
>> From: "M S Vishwanath Bhat" <msvbhat at gmail.com>
>> To: "Rajesh Joseph" <rjoseph at redhat.com>
>> Cc: "Vijay Bellur" <vbellur at redhat.com>, "Seema Naik" <senaik at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
>> Sent: Tuesday, June 3, 2014 5:55:27 PM
>> Subject: Re: [Gluster-devel] autodelete in snapshots
>>
>> On 3 June 2014 15:21, Rajesh Joseph <rjoseph at redhat.com> wrote:
>>
>>> ----- Original Message -----
>>> From: "M S Vishwanath Bhat" <msvbhat at gmail.com>
>>> To: "Vijay Bellur" <vbellur at redhat.com>
>>> Cc: "Seema Naik" <senaik at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
>>> Sent: Tuesday, June 3, 2014 1:02:08 AM
>>> Subject: Re: [Gluster-devel] autodelete in snapshots
>>>
>>> On 2 June 2014 20:22, Vijay Bellur <vbellur at redhat.com> wrote:
>>>
>>> On 04/23/2014 05:50 AM, Vijay Bellur wrote:
>>>
>>> On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote:
>>>
>>> On 04/16/2014 11:39 AM, Avra Sengupta wrote:
>>>
>>> The whole purpose of introducing the soft-limit is, that at any point of time the number of snaps should not exceed the hard limit. If we trigger auto-delete on hitting hard-limit, then the purpose itself is lost, because at that point we would be taking a snap, making the limit hard-limit + 1, and then triggering auto-delete, which violates the sanctity of the hard-limit. Also what happens when we are at hard-limit + 1, and another snap is issued, while auto-delete is yet to process the first delete. At that point we end up at hard-limit + 1. Also what happens if for a particular snap the auto-delete fails.
>>>
>>> We should see the hard-limit, as something set by the admin keeping in mind the resource consumption and at no-point should we cross this limit, come what may.
>>> If we hit this limit, the create command should fail asking the user to delete snaps using the "snapshot delete" command.
>>>
>>> The two options Raghavendra mentioned are applicable for the soft-limit only, in which cases on hitting the soft-limit
>>>
>>> 1. Trigger auto-delete
>>>
>>> or
>>>
>>> 2. Log a warning-message, for the user saying the number of snaps is exceeding the snap-limit and display the number of available snaps
>>>
>>> Now which of these should happen also depends on the user, because the auto-delete option is configurable.
>>>
>>> So if the auto-delete option is set as true, auto-delete should be triggered and the above message should also be logged.
>>>
>>> But if the option is set as false, only the message should be logged.
>>>
>>> This is the behaviour as designed. Adding Rahul, and Seema in the mail, to reflect upon the behaviour as well.
>>>
>>> Regards,
>>> Avra
>>>
>>> This sounds correct. However we need to make sure that the usage or documentation around this should be good enough, so that users understand the each of the limits correctly.
>>>
>>> It might be better to avoid the usage of the term "soft-limit". soft-limit as used in quota and other places generally has an alerting connotation. Something like "auto-deletion-limit" might be better.
>>>
>>> I still see references to "soft-limit" and auto deletion seems to get triggered upon reaching soft-limit.
>>>
>>> Why is the ability to auto delete not configurable? It does seem pretty nasty to go about deleting snapshots without obtaining explicit consent from the user.
>>>
>>> I agree with Vijay here. It's not good to delete a snap (even though it is oldest) without the explicit consent from user.
>>>
>>> FYI It took me more than 2 weeks to figure out that my snaps were getting autodeleted after reaching "soft-limit".
>>> For all I know I had not done anything and my snap restore were failing.
>>>
>>> I propose to remove the terms "soft" and "hard" limit. I believe there should be a limit (just "limit") after which all snapshot creates should fail with proper error messages. And there can be a water-mark after which user should get warning messages. So below is my proposal.
>>>
>>> auto-delete + snap-limit: If the snap-limit is set to n, next snap create (n+1th) will succeed only if auto-delete is set to on/true/1 and oldest snap will get deleted automatically. If autodelete is set to off/false/0, (n+1)th snap create will fail with proper error message from gluster CLI command. But again by default autodelete should be off.
>>>
>>> snap-water-mark: This should come in picture only if autodelete is turned off. It should not have any meaning if auto-delete is turned ON. Basically its usage is to give the user warning that limit almost being reached and it is time for admin to decide which snaps should be deleted (or which should be kept)
>>>
>>> *my two cents*
>>>
>>> -MS
>>>
>>> The reason for having a hard-limit is to stop snapshot creation once we reached this limit. This helps to have a control over the resource consumption. Therefore if we only have this limit (as snap-limit) then there is no question of auto-delete. Auto-delete can only be triggered once the count crosses the limit. Therefore we introduced the concept of soft-limit and a hard-limit. As the name suggests once the hard-limit is reached no more snaps will be created.
>>>
>> Perhaps I could have been clearer. auto-delete value does come into picture when limit is reached.
>>
>> There is a limit 'n' (snap-limit), and when we reach this limit, what happens to next snap create depends on the value of auto-delete (should be user configurable).
>> If auto-delete is ON (n+1)th snap create will actually delete the snap first (oldest or biggest or some other policy driven) and then create the next snap. If the auto-delete is set to OFF, then (n+1)th snap create will fail.
> Sorry I was not clear enough. We cannot delete the snap first because snapshot creation can fail for n number of reasons. And if the snapshot failed we unnecessarily delete a snapshot. Therefore always we should take a snapshot before issuing a delete. Hope this clear things.

Yes, this is a concern. I think there is another way to handle this situation.

Use the *soft-limit* to warn the user on every snap creation once the soft limit set by the user has been crossed. Upon reaching the *hard-limit* (say the user has set it to 100), when a new snap create command comes in (say for the 101st snapshot), create the new snapshot and move the oldest snapshot to some trash directory (say /var/lib/glusterd/snaps/.trash). As of now "oldest" is the only policy we follow; in future there can be multiple policies. Have a janitor thread running which wakes up at certain time intervals and cleans up the entries present in the trash directory. In this way we can ensure that the hard limit is not exceeded and that the snapshot is logically deleted.

Please provide feedback.

>
>> Now the Idea of having one more limit (water-mark or threshold-limit or something) which is less than snap-limit n is to warn the user that his limit is getting nearer. Now the admin will decide what snaps should be deleted (or which ones should be kept). The sole purpose of this is to warn the admin.
>>
>> Now 'snap-limit' can be called as hard-limit and water-mark can be called as soft-limit. But auto-delete should *NOT* be turned on by default and it should not delete upon reaching the soft-limit. It should be the hard-limit
> yes we agree that we should not turn this by-default, that's why we are planning to raise a bug to address this.
>
>> -MS
>>
>>> So the idea is to keep the number of snapshots always less than the hard-limit. To do so we introduced the concept of soft-limit, wherein we allow snapshots even when this limit is crossed and once the snapshot is taken we delete the oldest snap. If you consider this definition then the name soft-limit and hard-limit looks ok to me.
>>>
>>> In phase II we are planning to have auto-delete feature configurable with different policies, e.g. delete oldest, delete with more space consumption, etc. I think it is good to have the auto-delete feature enable & disable with an user controllable option. We will raise a bug to address this.
>>>
>>> Best Regards,
>>> Rajesh
>>>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
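
To make the trash-directory idea from the reply above a bit more concrete, here is a rough Python sketch. It is only a toy model under stated assumptions, not glusterd code: snapshots are plain directories, and SnapStore, create, janitor_pass and demo are hypothetical names invented for illustration. It creates the new snapshot first, then renames the oldest one into .trash once the hard limit is exceeded, warns past the soft limit, and leaves physical removal to a separate janitor pass.

```python
import shutil
import tempfile
from pathlib import Path


class SnapStore:
    """Toy model of the proposal; every name here is hypothetical."""

    def __init__(self, root, soft_limit, hard_limit):
        self.root = Path(root)
        self.trash = self.root / ".trash"
        self.trash.mkdir(parents=True, exist_ok=True)
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.active = []    # snapshot names, oldest first
        self.warnings = []  # stands in for log / CLI warnings

    def create(self, name):
        # Take the new snapshot first: if creation fails, nothing has
        # been deleted yet.
        (self.root / name).mkdir()
        self.active.append(name)

        # Over the hard limit: logically delete the oldest snapshot by
        # renaming it into the trash directory.
        if len(self.active) > self.hard_limit:
            oldest = self.active.pop(0)
            (self.root / oldest).rename(self.trash / oldest)

        # Past the soft limit: warn only, never delete.
        if len(self.active) > self.soft_limit:
            self.warnings.append(
                f"{len(self.active)} snapshots, soft-limit is {self.soft_limit}"
            )

    def janitor_pass(self):
        """One wake-up of the janitor: physically remove trashed snapshots."""
        purged = []
        for entry in sorted(self.trash.iterdir()):
            shutil.rmtree(entry)
            purged.append(entry.name)
        return purged


def demo():
    # Four creates against soft-limit 2 and hard-limit 3.
    with tempfile.TemporaryDirectory() as tmp:
        store = SnapStore(tmp, soft_limit=2, hard_limit=3)
        for i in range(4):
            store.create(f"snap{i}")
        return store.active, len(store.warnings), store.janitor_pass()
```

In this sketch the on-disk count is briefly hard-limit + 1 between the create and the rename, and a trashed snapshot keeps consuming space until the next janitor pass, so a real implementation would presumably have to decide whether that is acceptable for the resource limits the hard-limit is meant to protect.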