Hi all,

on an empty and unused Lustre 1.6.5.1 system I cannot reset or set the quota:

> ~# lfs quota -u troth /lustre
> Disk quotas for user troth:
>      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>         /lustre       4  3072000  309200               1   11000   10000
> MDT0000_UUID
>                       4*              1               1            6400
> OST0000_UUID
>                       0           16384

Try to reset this quota:

> ~# lfs setquota -u troth 0 0 0 0 /lustre
> setquota failed: Device or resource busy

Use "some" values instead:

> ~# lfs setquota -u troth 104000000 105000000 100000 100000 /lustre
> setquota failed: Device or resource busy

I know the manual says not to use "lfs setquota" to reset quotas, but that is yet another question. Of course there is a command "setquota", but it doesn't know about Lustre:

> ~# setquota -u troth 0 0 0 0 /lustre
> setquota: Mountpoint (or device) /lustre not found.
> setquota: Not all specified mountpoints are using quota.

as is to be expected. Mistake in the manual?

However, I'm mainly interested in what causes my system to be busy when it is not - no writes, not even reads. I did rerun "lfs quotacheck", but that didn't help, either.

Anybody got any hints what to do to manipulate quotas?

Thanks,
Thomas
Thomas,

setquota (from quota-tools) does not work with Lustre filesystems, so you cannot run it like "~# setquota -u troth 0 0 0 0 /lustre".

lfs can be used either to set quota limits or to reset them, and "~# lfs setquota -u troth 0 0 0 0 /lustre" is the correct way to reset quotas.

AFAIU, the cause of "Device or resource busy" when setting quota in your case could be that the MDS was performing setquota or quota recovery for the user troth. Could you check whether the MDS is stuck inside the mds_set_dqblk or mds_quota_recovery functions? (You can dump stack traces of the running threads into the kernel log with Alt-SysRq-t, provided the sysctl variable kernel.sysrq equals 1.)

Andrew.
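The stack dump Andrew suggests can be done without a console keyboard as well. A minimal sketch of the procedure, assuming a standard Linux /proc layout (this needs root on the MDS, and the grep pattern simply looks for the two functions Andrew names):

```shell
# Enable the SysRq interface (equivalent to: sysctl -w kernel.sysrq=1).
echo 1 > /proc/sys/kernel/sysrq
# Same effect as Alt-SysRq-t: dump all task stack traces into the kernel log.
echo t > /proc/sysrq-trigger
# Search the dumped traces for the quota functions mentioned above.
dmesg | grep -E 'mds_set_dqblk|mds_quota_recovery'
```

If the grep prints anything, a thread is currently inside one of those functions.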
Thomas Roth
2008-Dec-04  13:46 UTC
[Lustre-discuss] More: setquota fails, mds adjust qunit failed
Hi,
I'm still having these problems with resetting and setting quota. My
Lustre system seems to be forever 'setquota failed: Device or resource
busy'.
Right now, I have tried to write as much as my current quota setting
allows:
# lfs quota -u troth /lustre
Disk quotas for user troth:
      Filesystem  kbytes   quota   limit   grace   files   quota   limit 
   grace
         /lustre       4  3072000  309200               1   11000   10000
lust-MDT0000_UUID
                       4*              1               1            6400
lust-OST0000_UUID
                       0           16384
lust-OST0001_UUID
                       0           22528
...
I wrote some ~100 MB with 'dd', deleted the files, and then tried to
copy a directory - "Disk quota exceeded".
Now there are several questions: the listing above indicates that on the
MDT I have exceeded my quota - there's a 4* - without any data in my
Lustre directory. But this is only 4 kB - who knows what could take up
4 kB. (Another question is how I managed to set the quota on the MDT to
1 kB in the first place - unfortunately I did not write down my previous
"lfs setquota" commands while they were still successful.)
Still - how is it that I can write one 2 MB file in this situation, and
why can I not even create the directory (the one I wanted to copy),
without any files in it, before the quota blocks everything?
But wait - the story goes on. When I try to write with dd if=/dev/zero
of=..., the log of the MDT says
  Dec  4 14:20:39 lustre kernel: LustreError: 
3837:0:(quota_master.c:478:mds_quota_adjust()) mds adjust qunit failed! 
(opc:4 rc:-16)
This is reproducible and correlates with my write attempts.
So something might be broken here?
I have read further on in the Lustre Manual about quota. It keeps
talking about parameters found under "/proc/fs/lustre/lquota/...". I
don't have a subdirectory "lquota" there - neither on the MDT nor on
the OSTs. The parameters can be found, however, in
"/proc/fs/lustre/mds/lust-MDT0000/" and
"/proc/fs/lustre/obdfilter/lust-OSTxxxx".
Disturbingly enough, "/proc/fs/lustre/mds/lust-MDT0000/quota_type"
reads "off2".
On one OST, I found it to be "off". There, I tried "tunefs.lustre
--param ost.quota_type=ug /dev/sdb1", as mentioned in the manual.
Reading the parameters off the partition with tunefs tells me that the
quota_type is "ug", yet the entry
/proc/fs/lustre/mds/lust-MDT0000/quota_type is still "off".
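To see exactly where the on-disk parameter and the live setting disagree, the proc entries listed above can be compared side by side with what tunefs reports from the device. A sketch under the 1.6-era paths quoted above (the device name /dev/sdb1 is the example from the thread; --dryrun makes tunefs.lustre print the stored parameters without changing anything, though double-check that on your version before relying on it):

```shell
# Live quota_type as the running servers currently see it.
cat /proc/fs/lustre/mds/lust-MDT0000/quota_type
cat /proc/fs/lustre/obdfilter/lust-OST*/quota_type
# quota_type as stored on disk; --dryrun only reads, it changes nothing.
tunefs.lustre --dryrun /dev/sdb1 | grep -i quota_type
```

If the on-disk value is "ug" but proc still says "off", the parameter was written but the target has not picked it up, which usually means a remount of the target is needed.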
Now, we have had problems with quotas before, but in those cases already
"lfs quotacheck" would fail. On this system, not only did quotacheck
work, but while I still had the quotas set to sensible values, the
quota mechanism itself worked as desired. I conclude that this trouble
is not because I forgot to activate quota at some earlier stage, such
as kernel compilation or formatting of the Lustre partitions.
So I'm lost now and would appreciate any hint.
Oh, all of these servers are running Debian Etch 64bit, kernel 2.6.22, 
Lustre 1.6.5.1
Thomas
-- 
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.262
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
D-64291 Darmstadt
www.gsi.de
Gesellschaft mit beschränkter Haftung
Sitz der Gesellschaft: Darmstadt
Handelsregister: Amtsgericht Darmstadt, HRB 1528
Geschäftsführer: Professor Dr. Horst Stöcker
Vorsitzende des Aufsichtsrates: Dr. Beatrix Vierkorn-Rudolph,
Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
Thomas Roth
2008-Dec-04  17:31 UTC
[Lustre-discuss] setquota fails, mds adjust qunit failed, quota_interface.c ... quit checking
Hi,
it seems that with my faulty quota settings I can damage the client-OST
connection quite persistently.
As described earlier, I tried twice to write to Lustre from one client,
producing one file and one "Disk quota exceeded". Afterwards, this
client set the two OSTs which were affected by my attempts to
"Inactive".
On the OSS I see corresponding log entries:
    Dec  4 18:02:31 OSS96 kernel: Lustre:
19425:0:(ldlm_lib.c:760:target_handle_connect()) lust-OST0020: refuse
reconnection from dc6bd83f-6971-7f4d-1a22-77825f6d21a5@[IP]@tcp to
0xffff8101fb223000; still busy with 2 active RPCs
The connection is obviously severed permanently; at least a reboot of
the client does not change anything.
Of course, "still busy with N active RPCs" is another of the many
cryptic Lustre messages that merit an explanation, or better a repair
recipe, from the experts. It is found ever so often in the logs as well
as on the web, but it seems that waiting for the self-healing of Lustre
is all one can do?
Anyway, in my case these are only the problems after the attempt to
write to Lustre. During the attempt, I see on the OST:
    Dec  4 18:03:05 lxfs104 kernel: LustreError: 
20582:0:(quota_interface.c:473:quota_chk_acq_common()) we meet 10 errors 
or run too
    many cycles when acquiring quota, quit checking with rc: 0, cycle: 1000.
And the MDS complains as reported before
    (quota_master.c:478:mds_quota_adjust()) mds adjust qunit failed! 
(opc:4 rc:-16)
Btw, I have checked the output of sysrq-t for mds_set_dqblk and
mds_quota_recovery, as suggested by Andrew, yet found nothing.
So, what should I do? Unfortunately the system is already in use by
other people, so just starting fresh with a "global mkfs.lustre" is not
an option ;-)
Regards,
Thomas
James Beal
2009-Jan-16  14:45 UTC
[Lustre-discuss] setquota fails, mds adjust qunit failed, quota_interface.c ... quit checking
I am currently writing the documentation for a new Lustre system, which
will use quotas. We too are using Debian Etch 64bit,
2.6.22.19-lustre-1.6.6.

Is the correct way to change quotas to use lfs setquota, even though the
manual version "Lustre 1.6 Operations Manual - September 2008"
explicitly says not to do this?

-- 
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.