Sanoj Unnikrishnan
2017-Apr-24 14:10 UTC
[Gluster-users] Revisiting Quota functionality in GlusterFS
Hi All,

Considering that we are coming out with a major release plan, we would like to revisit the quota feature to decide its path forward. I have been working on the quota feature for a couple of months now and have come across various issues from performance, usability and correctness perspectives. We do have some initial thoughts on how these can be solved. But, to ensure we work on the most important things first, we would like to get a pulse on this from the users of the quota feature.

Below is a questionnaire for the same. To put the questions in perspective, I have provided the rationale and external links to alternative design thoughts. The focus of this mail thread, though, is to get a user pulse rather than to be a design review. Please comment on the design in the GitHub issues themselves. We can bring the discussions to gluster-devel as they gain traction. We would like the design discussion to be driven by the generated user feedback.

1) On how many directories do you generally have quota limits configured? How often are quota limits added/removed/changed? [Numbers would help here more than qualitative answers.]
Any use case with a large number of quota limits (say, > 100 on a single volume)?
Is a filesystem crawl acceptable each time a new quota limit is set? (The crawl may take time equivalent to a du command; this would essentially introduce a delay before a limit takes effect after it is set.)
Rationale: Currently, we account the usage of all directories. A performance issue with this approach is that we need to recursively account the usage of a file/directory on all its ancestors along the path. If only a few directories have limits configured, we could explore alternatives where accounting information is maintained only in directories that have limits set [RFC1].

2) How strict does the accounting generally need to be? Is it acceptable if there is an overshoot of, say, 100 MB? What are the typical values of the limits configured? Does anybody set limits in the order of MBs?
Rationale: Relaxing the accounting could be another way to gain performance. We can batch/cache xattr updates [RFC2].

3) Does directory quota suit your needs? Would you prefer it to work like XFS directory quota? What back-end filesystems do you expect to use quota with? Is it acceptable to take the route of leveraging the back-end FS quota, with support for a limited set of filesystems (or just XFS)?
Rationale: The behavior of directory quota in GlusterFS differs largely from the XFS way. Both have their pros and cons. GlusterFS allows you to set a limit on both /home and /home/user1. So if you write to /home/user1/file1, the limits on both its ancestors are checked and honored. An admin who has to give storage to 50 users can configure /home to 1 TB and each user's home to 25 GB, say (hoping that not all users would simultaneously use up their 25 GB). This may not make sense for cloud storage, but it does make sense in a university-lab kind of setting. The same cannot be done with XFS because XFS directory quota relies on the project ID. Only one project ID is associated with a directory, so only one limit can be honored at any time. XFS directory quota has its advantages, though. Please find details in [RFC3].

4) Do you use quota with a large number of bricks in a volume? (How many?) Do you have quota with a large number of volumes hosted on a node? (How many?) Have you seen any performance issues in such settings?
Rationale: We have a single quotad process (aggregator of accounting info) on each node in the trusted storage pool. All the bricks (even from different volumes) hosted on a node share the quotad. Another issue with this design is that a large number of bricks within a volume will increase IPC latency during aggregation. One way of mitigating this is to change quotad and the quota layer to work on a lease-based approach [RFC4].

5) If you set directory-level quota, do you expect to have hard links across directories? Or do you want rename() across directories to be supported?
Rationale: Supporting these operations consistently does add complexity to the design. XFS itself doesn't support these two operations when quota is set on it.

6) Do you use the inode-quota feature? Is any user looking for the user/group quota feature?

7) Are you using the current quota feature? If yes, are you happy with it? If yes and not happy, what are the things you would like to see improved?

RFC1 - Alternative accounting method in marker (https://github.com/gluster/glusterfs/issues/182)
RFC2 - Batched updates (https://github.com/gluster/glusterfs/issues/183)
RFC3 - XFS based quota (https://github.com/gluster/glusterfs/issues/184)
RFC4 - Lease based quotad (https://github.com/gluster/glusterfs/issues/181)

Note: These RFCs are not interdependent.

Thanks and Regards,
Sanoj
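
For readers who want a concrete picture of the behavior described in questions 1 and 3 (usage accounted on every ancestor directory, and a write checked against every limit along its path), here is a minimal toy sketch in Python. It is not GlusterFS code: the QuotaTree class and its methods are hypothetical names made up for illustration, and it ignores bricks, xattrs, quotad and concurrency entirely.

```python
import posixpath


class QuotaTree:
    """Toy model of nested directory quotas (illustrative only, not GlusterFS code)."""

    def __init__(self):
        self.limits = {}  # directory path -> limit in bytes
        self.usage = {}   # directory path -> accounted usage in bytes

    @staticmethod
    def ancestors(path):
        """Return the ancestor directories of an absolute path, nearest first."""
        if not path.startswith("/"):
            raise ValueError("path must be absolute")
        result = []
        current = posixpath.dirname(posixpath.normpath(path))
        while True:
            result.append(current)
            if current == "/":
                break
            current = posixpath.dirname(current)
        return result

    def set_limit(self, directory, limit_bytes):
        self.limits[directory] = limit_bytes

    def write(self, path, nbytes):
        """Account nbytes on all ancestors if every configured limit allows it."""
        dirs = self.ancestors(path)
        # Check every ancestor that has a limit before changing any usage.
        for d in dirs:
            limit = self.limits.get(d)
            if limit is not None and self.usage.get(d, 0) + nbytes > limit:
                return False
        # Recursive accounting: every ancestor along the path is updated.
        for d in dirs:
            self.usage[d] = self.usage.get(d, 0) + nbytes
        return True


GB = 1024 ** 3
quota = QuotaTree()
quota.set_limit("/home", 1024 * GB)       # 1 TB shared by all users
quota.set_limit("/home/user1", 25 * GB)   # per-user limit

print(quota.write("/home/user1/file1", 20 * GB))  # fits under both limits
print(quota.write("/home/user1/file2", 10 * GB))  # would exceed the 25 GB user limit
```

Note that every successful write touches each ancestor up to the root, which is the per-write cost that the rationale in question 1 refers to. A single-limit scheme such as the project-ID approach described in question 3 would consult only one limit per directory.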