I am having trouble getting quota to work on Lustre 1.6.7. Quota worked fine on Lustre 1.6.6, where I used the following settings on the MGS, MDT and OSTs:

  tunefs.lustre --erase-params --mgs --param lov.stripecount=1 --writeconf /dev/mapper/lustre1_volume-mgs_lv
  tunefs.lustre --erase-params --mdt --mgsnode=lustre1@tcp1 --param lov.stripecount=1 --writeconf --param mdt.quota_type=ug /dev/mapper/lustre1_volume-new_mds_lv
  tunefs.lustre --erase-params --ost --mgsnode=lustre1@tcp1 --param ost.quota_type=ug --writeconf /dev/sdc1

Once I upgraded from Lustre 1.6.6 to Lustre 1.6.7, the MDT crashes (kernel panic) instantaneously when I try to mount an OST that has quota enabled. I tried the latest Lustre-patched RHEL5 kernel as well as the kernel.org 2.6.22.14 kernel on the MDT server; that made no difference, and both kernels panicked instantly on OST mount.

To work around this, I had to remove the ost.quota_type=ug parameter on all my OSTs:

  tunefs.lustre --erase-params --ost --mgsnode=lustre1@tcp1 --writeconf /dev/sdc1

After removing quotas on the OSTs I am able to mount all of them and the file system is healthy, but quota is disabled. I get this error on the MDT server when I mount the various MGS, MDT and OST partitions:

  LustreError: 6372:0:(quota_master.c:1625:qmaster_recovery_main()) qmaster recovery failed! (id:120 type:1 rc:-3)

and the following error on the MDT server whenever I run "lfs quota" on a client node:

  lustre1 kernel: LustreError: 4591:0:(quota_ctl.c:288:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3

Is this a known problem with quotas on 1.6.7? Are there any patches available to fix it?

Thanks in advance.
Nirmal
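[For reference, the quota sequence normally run from a client once the quota_type parameters are in place on the MDT and OSTs looks roughly like this; the mount point /mnt/lustre and the user name are placeholders, and exact lfs option syntax can vary between 1.6.x releases. The last step is the one returning rc: -3 above.]

  # (re)build the quota files after quota_type has been set on the servers
  lfs quotacheck -ug /mnt/lustre
  # turn on enforcement for users and groups
  lfs quotaon -ug /mnt/lustre
  # verify that usage and limits are reported for a user
  lfs quota -u someuser /mnt/lustre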
On Wed, Apr 01, 2009 at 01:30:40PM -0500, Nirmal Seenu wrote:
> Once I upgraded from Lustre 1.6.6 to Lustre 1.6.7, the MDT crashes
> (kernel panic) instantaneously when I try to mount an OST that has
> quota enabled.

Could you please provide us with the console logs (panic message + stack trace)?

Cheers,
Johann
We didn't have anything in place to capture the console logs, and I won't be able to provide you any specific details of the kernel panic at this time. The kernel panic was easily reproducible with Lustre 1.6.7 on the Lustre-patched RHEL5 kernel as well as the 2.6.22.14 kernel.

In our configuration a separate machine hosts the MDT and MGS, each on its own partition. There are 2 OSSs, each OSS exports 6 OSTs, and each OST is 2.7TB in size. The MDT/MGS machine crashed consistently when I tried to mount the 5th or 6th (out of 12) OST.

These are the settings on the MGS, MDT and all the OSTs that produce the kernel panic:

  tunefs.lustre --erase-params --mgs --param lov.stripecount=1 --writeconf /dev/mapper/lustre1_volume-mgs_lv
  tunefs.lustre --erase-params --mdt --mgsnode=lustre1@tcp1 --param lov.stripecount=1 --writeconf --param mdt.quota_type=ug /dev/mapper/lustre1_volume-new_mds_lv
  tunefs.lustre --erase-params --ost --mgsnode=lustre1@tcp1 --param ost.quota_type=ug --writeconf /dev/sdc1

Our file system is in production use right now and I won't be able to take a downtime to reproduce this problem. Please let me know if I can provide you any other detail at this time.

Thanks
Nirmal

Johann Lombardi wrote:
> On Wed, Apr 01, 2009 at 01:30:40PM -0500, Nirmal Seenu wrote:
>> Once I upgraded from Lustre 1.6.6 to Lustre 1.6.7, the MDT crashes
>> (kernel panic) instantaneously when I try to mount an OST that has
>> quota enabled.
>
> Could you please provide us with the console logs (panic message +
> stack trace)?
>
> Cheers,
> Johann
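[If a serial console is not available on the MDT/MGS node, netconsole is one low-effort way to capture the panic message and stack trace the next time the crash can be reproduced. A minimal sketch follows; the interface name, IP address, port and MAC address are placeholders, not values from this thread.]

  # on the MDT/MGS node: make sure all kernel messages go to the console
  echo 8 > /proc/sys/kernel/printk
  # send console output over UDP to a log host
  # syntax: netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
  modprobe netconsole netconsole=@/eth0,6666@192.168.1.10/00:11:22:33:44:55
  # on the log host: capture whatever arrives on that port
  # (listener flag syntax differs between netcat variants)
  nc -u -l 6666 | tee mdt-console.log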
Hi,

On Thu, Apr 02, 2009 at 12:37:42PM -0500, Nirmal Seenu wrote:
> We didn't have anything in place to capture the console logs, and I
> won't be able to provide you any specific details of the kernel panic
> at this time.
>
> The kernel panic was easily reproducible with Lustre 1.6.7 on the
> Lustre-patched RHEL5 kernel as well as the 2.6.22.14 kernel.

We test those kernels regularly and, afaik, we have never hit such a problem.

> Our file system is in production use right now and I won't be able to
> take a downtime to reproduce this problem. Please let me know if I can
> provide you any other detail at this time.

Unfortunately, there is not much we can do without the console logs.

Johann
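[As a complement to console logging, kdump on the RHEL5 MDT node would preserve a vmcore even when nothing is watching the console. A rough outline under assumptions about the machine (the crashkernel reservation size and the dump target in /etc/kdump.conf depend on the hardware):

  # reserve memory for the capture kernel: append to the kernel line in grub.conf, then reboot
  #   crashkernel=128M@16M
  # pick a dump target in /etc/kdump.conf (local disk under /var/crash by default)
  chkconfig kdump on
  service kdump start
  # after a panic, the stack trace can be extracted from the saved vmcore with the crash utility
]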