Dear list,

We are running a small Lustre file system with 2 OSSs (one OSS shares a server with the MDS) for users' home directories. Several times we have seen incorrect user quotas: certain users got "Quota exceed" errors when their disk usage was only half of their quota.

We sometimes see errors like these on the MDS:

Aug 24 12:25:20 beshome01 kernel: Lustre: Skipped 3 previous similar messages
Aug 24 12:51:29 beshome01 kernel: LustreError: 28467:0:(quota_master.c:507:mds_quota_adjust()) mds adjust qunit failed! (opc:4 rc:-122)
Aug 24 12:52:06 beshome01 kernel: LustreError: 26005:0:(quota_master.c:507:mds_quota_adjust()) mds adjust qunit failed! (opc:4 rc:-122)

Is this error related to the incorrect quota problem? How can we avoid this situation?

Best Regards
Lu Wang
--------------------------------------------------------------
Computing Center
IHEP                      Office: Computing Center,123
Beijing 100049,China      Email: Lu.Wang at ihep.ac.cn
--------------------------------------------------------------
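As a reading aid for the log lines above, here is a minimal Python sketch that pulls the function name, `opc` and `rc` out of such a line and maps the return code to a Linux errno name. The regular expression and the small errno table are illustrative assumptions, not part of Lustre; the table is hardcoded with the Linux values so the result does not depend on the machine running the script.

```python
import re

# Linux errno numbers (hardcoded on purpose; other operating systems differ).
LINUX_ERRNO = {28: "ENOSPC", 122: "EDQUOT"}

# Assumed layout of a LustreError line, based only on the two samples in the post.
LOG_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)"
    r".*\(opc:(?P<opc>\d+) rc:(?P<rc>-?\d+)\)"
)

def decode(line):
    """Return func, opc, rc and the errno name for one log line, or None."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    return {
        "func": m.group("func"),
        "opc": int(m.group("opc")),
        "rc": rc,
        # Kernel code returns negative errno values, hence the sign flip.
        "errno": LINUX_ERRNO.get(-rc, "unknown"),
    }

sample = ("Aug 24 12:51:29 beshome01 kernel: LustreError: 28467:0:"
          "(quota_master.c:507:mds_quota_adjust()) mds adjust qunit failed! "
          "(opc:4 rc:-122)")
info = decode(sample)
print(info)
```

On Linux, errno 122 is EDQUOT ("Disk quota exceeded"), so the MDS message reports the same condition the users see. Decoding the number does not by itself explain why the limit was considered reached.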
Hi Yong,

We are running lustre-1.6.5-2.6.9_55.EL.cernsmp (32-bit) on the clients and 2.6.9-67.0.22.EL_lustre.1.6.6smp (64-bit) on the servers. Each user has a 20 GB quota for file size (blocks). We have not set a quota on the number of files.

# lfs quota -u **** /besfs2
Disk quotas for user **** (uid 23034):
     Filesystem  kbytes     quota     limit  grace  files  quota  limit  grace
        /besfs2       4  20000000  20100000             1      0      0
besfs2-MDT0000_UUID
    4  131072  1  0
besfs2-OST0000_UUID
    0  131072
besfs2-OST0001_UUID
    0  131072
besfs2-OST0002_UUID
    0  131072
besfs2-OST0003_UUID
    0  131072
besfs2-OST0004_UUID
    0  131072
besfs2-OST0005_UUID
    0  131072
besfs2-OST0006_UUID
    0  131072

I am sorry that I do not have a screenshot of the error. It would show that a "quota exceed" error was triggered although the user had not reached the file size quota.

------------------
Lu Wang
2009-08-28

-------------------------------------------------------------
From: Fan Yong
Sent: 2009-08-28 17:12:09
To: Lu Wang
Cc:
Subject: Re: [Lustre-discuss] Incorrect user's Quota

Which version of Lustre are you using? Is the "Quota exceed" error for blocks or for files? What limits are set for your users' quotas? Detailed information will help to localize the issue.

--
Fan Yong

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
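To put the figures from the `lfs quota` output above into more familiar units, here is a short Python sketch. It assumes that "kbytes" means 1024-byte blocks and that the 131072 shown on each target row is an amount of block quota associated with that target; the thread itself does not state how to interpret the per-target rows, so treat the last two numbers as illustrative arithmetic only.

```python
KB_PER_GIB = 1024 * 1024     # assumes lfs quota counts 1024-byte blocks

soft_kb = 20_000_000         # "quota" column of the /besfs2 row
hard_kb = 20_100_000         # "limit" column of the /besfs2 row
per_target_kb = 131_072      # figure shown on every MDT/OST row
ost_rows = 7                 # besfs2-OST0000 .. besfs2-OST0006

soft_gib = soft_kb / KB_PER_GIB
hard_gib = hard_kb / KB_PER_GIB
per_target_mib = per_target_kb / 1024

# Sum of the per-OST figures and its share of the hard limit.
ost_total_kb = per_target_kb * ost_rows
share_of_hard = ost_total_kb / hard_kb

print(f"soft limit      : {soft_gib:.2f} GiB")
print(f"hard limit      : {hard_gib:.2f} GiB")
print(f"per-target value: {per_target_mib:.0f} MiB")
print(f"sum over OSTs   : {ost_total_kb} kB ({share_of_hard:.1%} of the hard limit)")
```

Under these assumptions the per-OST figures add up to a few percent of the limit, which is far from the "half of the quota" gap reported in the first message.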
Lu Wang wrote:
> I am sorry that I do not have a screenshot of the error. It would show that a "quota exceed" error was triggered although the user had not reached the file size quota.

One possible reason is open-delete operations (a file is deleted before it has actually been closed). In that case the disk space and the quota are not released until the file is really closed.

In addition, Lustre quota is distributed, so it is not as accurate as quota on a local file system. That may cause "Quota exceed" errors when usage is near the limit but has not yet hit it.

If neither of these explains your failure, I suggest upgrading to Lustre 1.6.7 or 1.8.1. Other users have reported similar issues before, and we have fixed some quota-related issues in those releases.

--
Fan Yong
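The open-delete case described above can be reproduced on any POSIX file system. The following Python sketch is a local illustration only (it uses a temporary file, not Lustre): it writes a file, unlinks it while the descriptor is still open, and shows that the data is still there until the descriptor is closed.

```python
import os
import tempfile

def open_unlink_demo(size=1024 * 1024):
    """Unlink a file while it is open and report what is still visible."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * size)
        os.unlink(path)            # the name disappears from the directory
        st = os.fstat(fd)          # the inode still exists while fd is open
        return os.path.exists(path), st.st_nlink, st.st_size
    finally:
        os.close(fd)               # only now can the space be reclaimed

exists, nlink, size = open_unlink_demo()
print("path exists:", exists)
print("link count :", nlink)
print("size       :", size)
```

The path no longer exists, yet the file keeps its size until `os.close` runs. A long-running job that deletes its scratch files without closing them would therefore keep consuming quota. On a client node, `lsof +L1` lists open files whose link count is below one, which is one way to look for such files.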
Hi Yong,

I do not think the cause is the one you mentioned in your first paragraph. About upgrading Lustre, I still have two questions:

1. Is it possible to upgrade the servers only? I mean, without copying data from the old Lustre version (1.6.6) to the new version (1.8 or above), just remounting the OSTs.
2. Is 1.8 a stable version, or just a transition version for 2.0? If 2.0 is coming in a few months, we would upgrade the system directly to 2.0 (to reduce service downtime and risk).

Thank you very much in advance.

------------------
Lu Wang
2009-08-31

-------------------------------------------------------------
From: Fan Yong
Sent: 2009-08-28 18:02:59
To: Lu Wang
Cc: lustre-discuss
Subject: Re: [Lustre-discuss] Incorrect user's Quota

One possible reason is open-delete operations (a file is deleted before it has actually been closed). In that case the disk space and the quota are not released until the file is really closed.
Lu Wang wrote:
> Hi Yong,
> I do not think the cause is the one you mentioned in your first paragraph. About upgrading Lustre, I still have two questions:
> 1. Is it possible to upgrade the servers only? I mean, without copying data from the old Lustre version (1.6.6) to the new version (1.8 or above), just remounting the OSTs.

Yes, Lustre supports online upgrade, and the on-disk formats of 1.6 and 1.8 are compatible. That means there is no need to copy data from the old version to the new one. To upgrade your servers, you can follow these steps:

1) install the Lustre 1.8 modules
2) unmount the servers
3) unload the Lustre-related modules (the 1.6 ones)
4) load the Lustre-related modules (the 1.8 ones)
5) remount the servers

> 2. Is 1.8 a stable version, or just a transition version for 2.0? If 2.0 is coming in a few months, we would upgrade the system directly to 2.0 (to reduce service downtime and risk).

Yes, Lustre 1.8 is a stable version, although it also serves as the transition from 1.6 to 2.0. According to the schedule, Lustre 2.0 will come in a few months, but you cannot upgrade your system online from 1.6 directly to 2.0.

Regards,
--
Fan Yong
Lustre Group
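The five upgrade steps above can be written down as a dry-run checklist. The Python sketch below only builds the command strings and prints them; it executes nothing. The package file names, device paths and mount points are hypothetical placeholders, `lustre_rmmod` is assumed to be available from the installed Lustre utilities, and the order in which targets are unmounted and mounted is simply the order the caller passes in, since the thread does not discuss ordering.

```python
def upgrade_plan(install_cmd, targets):
    """Build the five steps as shell command strings; nothing is executed.

    install_cmd: whatever command installs the 1.8 packages at your site.
    targets:     list of (device, mount_point) pairs for the MDT and OSTs.
    """
    plan = [install_cmd]                                   # 1) install 1.8 modules
    plan += [f"umount {mnt}" for _dev, mnt in targets]     # 2) unmount the servers
    plan.append("lustre_rmmod")                            # 3) unload the 1.6 modules
    plan.append("modprobe lustre")                         # 4) load the 1.8 modules
    plan += [f"mount -t lustre {dev} {mnt}"                # 5) remount the servers
             for dev, mnt in targets]
    return plan

# Hypothetical package names, devices and mount points, for illustration only.
plan = upgrade_plan(
    "rpm -Uvh lustre-modules-1.8.x.rpm lustre-1.8.x.rpm",
    [("/dev/sdb1", "/mnt/mdt"), ("/dev/sdc1", "/mnt/ost0")],
)
for number, command in enumerate(plan, start=1):
    print(f"{number}. {command}")
```

Printing the plan first makes it easy to review the sequence with the real device names before anything is unmounted. Clients should not be using the file system while the servers are down.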