Hello. I've just noticed that OSTs marked as 'inactive' for some reason don't count in the quota display.

I have a handful of OSTs temporarily marked inactive because they are close to full, and as soon as they are marked inactive, 'lfs quota' stops showing their contribution to the total. Is this the correct behavior?

'lfs quota -v' shows each individual OST properly, but the totals are wrong. I don't actually have quotas set, but I use the quota facility to keep track of how much space my users are using, and having the numbers inconsistent like this is a pain.

We're running 1.8.7wc with kernel 2.6.18-274.3.1.el5_lustre.g9500ebf.

Thanks,
Kevin

All OSTs active:

# lfs quota -v -u kevin /export/lustre_1
Disk quotas for user kevin (uid 7260):
Filesystem              kbytes  quota  limit  grace  files  quota  limit  grace
/export/lustre_1        428444      0      0      -   1683      0      0      -
lustre_1-MDT0000_UUID      748      -      0      -   1683      -      0      -
lustre_1-OST0000_UUID   399652      -      0      -      -      -      -      -
lustre_1-OST0001_UUID     6908      -      0      -      -      -      -      -
lustre_1-OST0002_UUID    21124      -      0      -      -      -      -      -
lustre_1-OST0003_UUID        8      -      0      -      -      -      -      -
lustre_1-OST0004_UUID        4      -      0      -      -      -      -      -
lustre_1-OST0005_UUID        0      -      0      -      -      -      -      -
lustre_1-OST0006_UUID        0      -      0      -      -      -      -      -

OSTs 0000-0003 inactive:

deepthought:~# lfs quota -v -u kevin /export/lustre_1
Disk quotas for user kevin (uid 7260):
Filesystem              kbytes  quota  limit  grace  files  quota  limit  grace
/export/lustre_1           752      0      0      -   1683      0      0      -
lustre_1-MDT0000_UUID      748      -      0      -   1683      -      0      -
lustre_1-OST0000_UUID   399652      -      0      -      -      -      -      -
lustre_1-OST0001_UUID     6908      -      0      -      -      -      -      -
lustre_1-OST0002_UUID    21124      -      0      -      -      -      -      -
lustre_1-OST0003_UUID        8      -      0      -      -      -      -      -
lustre_1-OST0004_UUID        4      -      0      -      -      -      -      -
lustre_1-OST0005_UUID        0      -      0      -      -      -      -      -
lustre_1-OST0006_UUID        0      -      0      -      -      -      -      -

--
Kevin Hildebrand
University of Maryland, College Park
Division of IT
On Fri, Apr 20, 2012 at 09:02:06AM -0400, Kevin Hildebrand wrote:
> I have a handful of OSTs temporarily marked inactive because they are
> close to full, and as soon as they are marked inactive, 'lfs quota' stops
> showing their contribution to the total. Is this the correct behavior?

No, that's not the correct behavior. The problem is that the lustre client relies on the MDS to collect real disk usage from all the OSTs (via Q_GETOQUOTA RPCs). This is different in orion (i.e. 2.4) where it is up to the lustre client to retrieve usage from OSTs directly.

We landed a patch recently (for compatibility with orion) which might help a little bit, see http://review.whamcloud.com/#change,1570. It allows the client to fetch disk usage directly from OSTs if the MDS does not set QIF_SPACE in the reply (that's the case of orion MDS).

That said, applying this patch still does not bring you where you want since the 1.8 MDS still replies with QIF_SPACE. To address this, you will have to patch the mds_get_dqblk() function in lustre/quota/quota_master.c not to call mds_get_space().

HTH

Johann

--
Johann Lombardi
Whamcloud, Inc.
www.whamcloud.com
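
As a stopgap that avoids patching, the per-target rows that 'lfs quota -v' still reports correctly can be summed on the client side. The following is only a minimal Python sketch, assuming the whitespace-separated layout shown in the original post (a target name ending in _UUID followed directly by its kbytes value). The function name sum_target_usage is hypothetical and not part of Lustre, and stripping '*' or '[' ']' around a value is a defensive assumption rather than something the posted output shows.

```python
def sum_target_usage(quota_output: str) -> int:
    """Sum the kbytes value of every per-target row in 'lfs quota -v' output.

    A per-target row is recognised by a name ending in _UUID; the token that
    follows it is taken as that target's kbytes. The filesystem summary row
    is skipped because its name does not end in _UUID.
    """
    tokens = quota_output.split()
    total = 0
    # Walk adjacent token pairs so the parse does not depend on line breaks.
    for name, kbytes in zip(tokens, tokens[1:]):
        if name.endswith("_UUID"):
            digits = kbytes.strip("*[]")  # assumption: tolerate decorated values
            if digits.isdigit():
                total += int(digits)
    return total


# Output from the original post with OSTs 0000-0003 marked inactive.
SAMPLE = """\
Disk quotas for user kevin (uid 7260):
Filesystem kbytes quota limit grace files quota limit grace
/export/lustre_1 752 0 0 - 1683 0 0 -
lustre_1-MDT0000_UUID 748 - 0 - 1683 - 0 -
lustre_1-OST0000_UUID 399652 - 0 - - - - -
lustre_1-OST0001_UUID 6908 - 0 - - - - -
lustre_1-OST0002_UUID 21124 - 0 - - - - -
lustre_1-OST0003_UUID 8 - 0 - - - - -
lustre_1-OST0004_UUID 4 - 0 - - - - -
lustre_1-OST0005_UUID 0 - 0 - - - - -
lustre_1-OST0006_UUID 0 - 0 - - - - -
"""

if __name__ == "__main__":
    print(sum_target_usage(SAMPLE))  # prints 428444
```

For the sample above the sum is 428444 kbytes, which matches the total reported when all OSTs were active, rather than the 752 shown in the summary row once OSTs 0000-0003 were deactivated.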