Hi all,

on an empty and unused Lustre 1.6.5.1 system I cannot reset or set the quota:

> ~# lfs quota -u troth /lustre
> Disk quotas for user troth:
> Filesystem kbytes quota limit grace files quota limit grace
> /lustre 4 3072000 309200 1 11000 10000
> MDT0000_UUID
> 4* 1 1 6400
> OST0000_UUID
> 0 16384

Trying to reset this quota:

> ~# lfs setquota -u troth 0 0 0 0 /lustre
> setquota failed: Device or resource busy

Using "some" values instead:

> ~# lfs setquota -u troth 104000000 105000000 100000 100000 /lustre
> setquota failed: Device or resource busy

I know the manual says not to use "lfs setquota" to reset quotas, but that is a separate question. There is of course a command "setquota", but it doesn't know about Lustre:

> ~# setquota -u troth 0 0 0 0 /lustre
> setquota: Mountpoint (or device) /lustre not found.
> setquota: Not all specified mountpoints are using quota.

This is to be expected. Is it a mistake in the manual?

However, I'm mainly interested in what causes my system to be busy when it is not: there are no writes, not even reads. I did rerun "lfs quotacheck", but that didn't help either.

Has anybody got any hints on how to manipulate quotas?

Thanks,
Thomas
Thomas,

setquota (from quota-tools) does not work with Lustre filesystems, so you cannot run it like "~# setquota -u troth 0 0 0 0 /lustre".

lfs can be used both to set quota limits and to reset them, and "~# lfs setquota -u troth 0 0 0 0 /lustre" is the correct way to reset quotas.

AFAIU, the cause of "Device or resource busy" when setting quota in your case could be that the MDS was performing setquota or quota recovery for the user troth. Could you check whether the MDS is stuck inside the mds_set_dqblk or mds_quota_recovery functions? You can dump stack traces of running threads into the kernel log with alt-sysrq-t, provided the sysctl variable kernel.sysrq equals 1.

Andrew.

On Friday 28 November 2008 17:50:51 Thomas Roth wrote:
> Hi all,
>
> on an empty and unused Lustre 1.6.5.1 system I cannot reset or set the quota:
> [...]

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
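The check suggested in the reply above can be sketched in shell. This is only an illustration, not a procedure taken from the Lustre manual: the helper name find_quota_threads is hypothetical, and the commented commands assume a root shell on the MDS where writing "t" to /proc/sysrq-trigger is used as the shell equivalent of alt-sysrq-t.

```shell
#!/bin/sh
# Hypothetical helper: filter a task dump read from stdin for the two
# MDS quota functions named in the reply. Prints the matching lines and
# returns a nonzero status if neither function appears.
find_quota_threads() {
    grep -E 'mds_set_dqblk|mds_quota_recovery'
}

# Possible use on the MDS, as root (not executed here):
#   sysctl kernel.sysrq              # check that sysrq is enabled
#   echo t > /proc/sysrq-trigger     # dump task states to the kernel log
#   dmesg | find_quota_threads
```

An empty result only says that no thread was inside those functions at the moment of the dump, so repeating the dump while a setquota attempt is failing may be more informative.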
Thomas Roth
2008-Dec-04 13:46 UTC
[Lustre-discuss] More: setquota fails, mds adjust qunit failed
Hi,

I'm still having these problems with resetting and setting quota. My Lustre system seems to be forever 'setquota failed: Device or resource busy'.

Right now, I have tried to write as much as my current quota setting allows:

    # lfs quota -u troth /lustre
    Disk quotas for user troth:
    Filesystem kbytes quota limit grace files quota limit grace
    /lustre 4 3072000 309200 1 11000 10000
    lust-MDT0000_UUID
    4* 1 1 6400
    lust-OST0000_UUID
    0 16384
    lust-OST0001_UUID
    0 22528
    ...

I wrote about 100 MB with 'dd', deleted the files again and tried to copy a directory: "Disk quota exceeded".

Now there are several questions. The listing above indicates that I have exceeded my quota on the MDT (there's a 4*) without any data in my Lustre directory. But this is only 4 kB, and who knows what could take up 4 kB. (Another question is how I managed to set the quota on the MDT to 1 kB in the first place. Unfortunately I did not write down my previous "lfs setquota" commands while they were still successful.) Still, how can I write one file of 2 MB in this situation, and why can I not even create the directory I wanted to copy, without any files in it, before the quota blocks everything?

But wait, the story goes on. When I try to write with dd if=/dev/zero ..., the log of the MDT says

    Dec 4 14:20:39 lustre kernel: LustreError: 3837:0:(quota_master.c:478:mds_quota_adjust()) mds adjust qunit failed! (opc:4 rc:-16)

This is reproducible and correlates with my write attempts. So something might be broken here?

I have read further in the Lustre Manual about quota. It keeps talking about parameters found in "/proc/fs/lustre/lquota/...". I don't have a subdirectory "lquota" there, neither on the MDT nor on the OSTs. The parameters can be found, however, in "/proc/fs/lustre/mds/lust-MDT0000/" and "/proc/fs/lustre/obdfilter/lust-OSTxxxx".

Disturbingly enough, "/proc/fs/lustre/mds/lust-MDT0000/quota_type" reads "off2". On one OST, I found it to be "off". There, I tried "tunefs.lustre --param ost.quota_type=ug /dev/sdb1", as mentioned in the manual. Reading the parameters off the partition with tunefs tells me that the quota_type is "ug", but the entry /proc/fs/lustre/mds/lust-MDT0000/quota_type is still "off".

We have had problems with quotas before, but in those cases "lfs quotacheck" would already fail. On this system, not only did quotacheck work, but while I still had quotas set to sensible values, the quota mechanism itself worked as desired. I conclude that this trouble is not caused by forgetting to activate quota at some earlier stage, such as kernel compilation or formatting the Lustre partitions.

So I'm lost now and would appreciate any hint.

Oh, all of these servers are running Debian Etch 64bit, kernel 2.6.22, Lustre 1.6.5.1.

Thomas

--
--------------------------------------------------------------------
Thomas Roth
Department: Information Technology
Location: SB3 1.262
Phone: +49-6159-71 1453 Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
D-64291 Darmstadt
www.gsi.de

Limited liability company (GmbH)
Registered office: Darmstadt
Commercial register: Amtsgericht Darmstadt, HRB 1528
Managing director: Professor Dr. Horst Stöcker
Chair of the supervisory board: Dr. Beatrix Vierkorn-Rudolph, Deputy: Ministerialdirigent Dr. Rolf Bernhardt
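To compare the quota_type entries mentioned in the message above across targets, a loop like the following might help. It is a minimal sketch under assumptions: the function name show_quota_type is hypothetical, and it presumes that a quota_type file exists in each target directory under the mds and obdfilter paths quoted in the message.

```shell
#!/bin/sh
# Hypothetical helper: print "<file>: <value>" for every quota_type entry
# below a Lustre /proc root. The root is a parameter (default
# /proc/fs/lustre) so the function can also be pointed at a copy of the tree.
show_quota_type() {
    root=${1:-/proc/fs/lustre}
    for f in "$root"/mds/*/quota_type "$root"/obdfilter/*/quota_type; do
        [ -r "$f" ] || continue   # skip unmatched globs and unreadable files
        printf '%s: %s\n' "$f" "$(cat "$f")"
    done
}
```

Running show_quota_type on the MDS and on each OSS shows at a glance which targets report a value other than the one written with tunefs.lustre.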
Thomas Roth
2008-Dec-04 17:31 UTC
[Lustre-discuss] setquota fails, mds adjust qunit failed, quota_interface.c ... quit checking
Hi,

it seems that with my faulty quota settings I can damage the client - OST connection quite persistently. As described earlier, I have tried twice to write to Lustre from one client, producing one file and one "Disk quota exceeded". Afterwards, this client marks the two OSTs that were affected by my attempts as "Inactive". On the OSS I see corresponding log entries:

    Dec 4 18:02:31 OSS96 kernel: Lustre: 19425:0:(ldlm_lib.c:760:target_handle_connect()) lust-OST0020: refuse reconnection from dc6bd83f-6971-7f4d-1a22-77825f6d21a5@[IP]@tcp to 0xffff8101fb223000; still busy with 2 active RPCs

The connection is obviously severed permanently; at least a reboot of the client does not change anything. Of course "still busy with N active RPCs" is another of the many cryptic Lustre messages that merit an explanation, or better, a repair recipe from the experts. It shows up often in the logs as well as on the web, but it seems that waiting for Lustre to heal itself is all one can do?

Anyway, in my case these are only the problems after the attempt to write to Lustre. During the attempt, I see on the OST:

    Dec 4 18:03:05 lxfs104 kernel: LustreError: 20582:0:(quota_interface.c:473:quota_chk_acq_common()) we meet 10 errors or run too many cycles when acquiring quota, quit checking with rc: 0, cycle: 1000.

And the MDS complains as reported before:

    (quota_master.c:478:mds_quota_adjust()) mds adjust qunit failed! (opc:4 rc:-16)

Btw, I have checked the output of sysrq-t for mds_set_dqblk or mds_quota_recovery as suggested by Andrew, yet found nothing.

So, what should I do? Unfortunately the system is already in use by other people, so just starting fresh with a "global mkfs.lustre" is not an option ;-)

Regards,
Thomas
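The "rc:-16" in the MDS message and the "Device or resource busy" reported by lfs setquota appear to be the same condition: on Linux, errno 16 is EBUSY, whose standard message is "Device or resource busy". The sketch below pulls the return code out of such a log line and names it. The function names are hypothetical, and the table covers only the codes relevant to this thread (122 is EDQUOT, "Disk quota exceeded", on Linux).

```shell
#!/bin/sh
# Hypothetical helper: extract the return code from a Lustre log line such
# as "... rc:-16)" or "... rc: 0, ..." and print it without the sign.
extract_rc() {
    sed -n 's/.*rc: *-\{0,1\}\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: name the few Linux errno values seen in this thread.
errno_name() {
    case $1 in
        0)   echo "success" ;;
        16)  echo "EBUSY (Device or resource busy)" ;;
        122) echo "EDQUOT (Disk quota exceeded)" ;;
        *)   echo "errno $1 (look it up in errno.h)" ;;
    esac
}
```

For example, piping the mds_quota_adjust line through extract_rc and passing the result to errno_name yields the EBUSY entry, which suggests the write path and the setquota path are hitting the same busy condition on the MDS rather than two unrelated faults.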
James Beal
2009-Jan-16 14:45 UTC
[Lustre-discuss] setquota fails, mds adjust qunit failed, quota_interface.c ... quit checking
I am currently writing the documentation for a new Lustre system, which will use quotas. We too are using Debian Etch 64bit, 2.6.22.19-lustre-1.6.6.

Is the correct way to change quotas to use lfs setquota, even though the manual version "Lustre 1.6 Operations Manual - September 2008" explicitly says not to do this?

Andrew Perepechko wrote:
> lfs can be used either to set quota limits or to reset them and
> " ~# lfs setquota -u troth 0 0 0 0 /lustre" is the correct way to
> reset quotas.

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
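For documentation purposes, the positional form of lfs setquota used in this thread can be wrapped in a small helper that prints the command for review instead of running it. This is a sketch under assumptions, not the manual's procedure: the function name build_setquota is hypothetical, the four numbers are assumed to be block soft limit, block hard limit, inode soft limit and inode hard limit in that order, and the soft-versus-hard check is an extra sanity test added here, not something the thread establishes as required.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) an "lfs setquota" command in
# the positional form used in this thread. Assumed argument order:
#   user block-soft block-hard inode-soft inode-hard mountpoint
# A nonzero hard limit that is smaller than the soft limit is rejected.
build_setquota() {
    user=$1 bsoft=$2 bhard=$3 isoft=$4 ihard=$5 mnt=$6
    if [ "$bhard" -ne 0 ] && [ "$bsoft" -gt "$bhard" ]; then
        echo "block soft limit $bsoft exceeds hard limit $bhard" >&2
        return 1
    fi
    if [ "$ihard" -ne 0 ] && [ "$isoft" -gt "$ihard" ]; then
        echo "inode soft limit $isoft exceeds hard limit $ihard" >&2
        return 1
    fi
    printf 'lfs setquota -u %s %s %s %s %s %s\n' \
        "$user" "$bsoft" "$bhard" "$isoft" "$ihard" "$mnt"
}
```

Calling build_setquota troth 0 0 0 0 /lustre prints the reset command confirmed as correct earlier in the thread. The values shown in the original quota listing (3072000 and 309200, 11000 and 10000) would be rejected by this check; whether such limits have anything to do with the failures reported above is not established here.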