Hi All,

I'm experiencing some problems with a test installation of Lustre 1.8.0.x.

The installation consists of one server hosting the MDS and two servers hosting the OSTs. One of the servers has 12x2.7TB devices and the other has 16x2.3TB devices.

All the devices were configured with:

"tunefs.lustre --ost --mgsnode=lustre01@tcp0 --param ost.quota_type=ug --writeconf /dev/sdxx"

On the admin node I issued "lfs quotacheck -ug /lustre" (I see read operations occurring on both disk servers), and it ends without error.

I was able to set up per-user quotas on the admin node, and they seem to be registered successfully when checking with "lfs quota -u donvito /lustre".

The problem I see is that a user can overrun the quota, because the two servers behave differently: one of the two denies writes while the other does not. I tried with both the Lustre RPMs and a vanilla (2.6.22) patched kernel, and the result is the same. It is not related to the physical server, as both of them sometimes show this behaviour (but only one of the servers at a time). I have tried with both 1.8.0 and 1.8.0.1 and the same behaviour is observed.

As you can see, the system is correctly accounting the used space, but the server does not deny writes:

[root@lustre01 ~]# lfs quota -u donvito /lustre
Disk quotas for user donvito (uid 501):
     Filesystem     kbytes     quota     limit  grace  files  quota  limit  grace
        /lustre  124927244*  12000000  15000000            24    100    100

The "messages" logs on the MDS and on both OSS servers are clean and nothing strange is noticeable.
Do you have any idea where I can look in order to understand the problem?
Do you have any other suggestions for tests that I can run?

Thank you very much.

Best Regards,
Giacinto

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Giacinto Donvito LIBI -- EGEE2 SA1 INFN - Bari ITALY
------------------------------------------------------------------
giacinto.donvito at ba.infn.it | GTalk/GMail: donvito.giacinto at gmail.com
tel. +39 080 5443244 Fax +39 0805442470 VOIP: +41225481596 | MSN: donvito.giacinto at hotmail.it
Skype: giacinto_it | AIM/iChat: gdonvito1 | Yahoo: eric1_it
------------------------------------------------------------------
"A simple design always takes less time to finish than a complex one.
So always do the simplest thing that could possibly work." Don Wells at www.extremeprogramming.org
"Writing about music is like dancing about architecture." - Frank Zappa
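[For reference, the configuration sequence described above, collected in one place. The device name, NID, mount point, user name and limits are the ones quoted in this report and are purely illustrative; the setquota invocation follows the syntax used later in the thread, with limits matching the 12000000/15000000 kbytes visible in the lfs quota output.]

# on each OSS, for every OST block device (the /dev/sdxx placeholder kept as in the report)
tunefs.lustre --ost --mgsnode=lustre01@tcp0 --param ost.quota_type=ug --writeconf /dev/sdxx

# on the admin node, with the filesystem mounted on /lustre: build the quota files
lfs quotacheck -ug /lustre

# per-user block quota: roughly 12 GB soft / 15 GB hard, expressed in kbytes
lfs setquota -u donvito --block-softlimit 12000000 --block-hardlimit 15000000 /lustre

# check what was registered
lfs quota -u donvito /lustre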
I have quota working on one of our clusters, and I am trying to figure out what is wrong with the second cluster.

On the cluster where quota works, the Lustre servers are running the 2.6.18-128.1.6.el5_lustre.1.8.0.1smp kernel and the Lustre clients run on the RHEL5.3 distro with the 2.6.18-128.1.10.el5 kernel + Lustre patchless client version 1.8.0.1.

On the other cluster, where I have the same problem you are facing, the Lustre servers are running 2.6.18-92.1.17.el5_lustre.1.6.7.1smp and the clients run on RHEL4.4 with a 2.6.21 kernel.org kernel + Lustre patchless client version 1.6.7.

The quota problem seems to be related to the kernel on the client side. I guess that there is no 64-bit quota support in the 2.6.21 kernel?

What configuration do you have on the client side? I would be curious to see whether quota works fine with the 2.6.18-128.1.10.el5 kernel on your clients.

Nirmal
Hi Nirmal,

I've tried with both "2.6.18-128.1.6.el5_lustre.1.8.0.1smp x86_64" and "2.6.22.14 x86_64" clients.

The result is the same.

Cheers,
Giacinto

-- --
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Giacinto Donvito LIBI -- EGEE3 SA1 INFN - Bari ITALY
------------------------------------------------------------------
giacinto.donvito at ba.infn.it | GTalk/GMail: donvito.giacinto at gmail.com
tel. +39 080 5443244 Fax +39 0805442470 | Skype: giacinto_it
VOIP: +41225481596 | MSN: donvito.giacinto at hotmail.it
AIM/iChat: gdonvito1 | Yahoo: eric1_it
------------------------------------------------------------------
"The Linux philosophy is 'Laugh in the face of danger'. Oops. Wrong one. 'Do it yourself'. Yes, that's it." Linus Torvalds

On 26 Jun 2009, at 17:41, Nirmal Seenu wrote:

> What configuration do you have on the client side? I would be curious to
> see whether quota works fine with the 2.6.18-128.1.10.el5 kernel on your
> clients.
>
> Nirmal
I also tried Lustre 1.6.7.2 with the official kernels, and it always behaves the same way: the first server that joined honours the quota and denies writes when the quota is reached, while the second server still allows writes from the same user and the same client.

I also tried different kernels on the client side, always with the same behaviour.

Does anyone have any suggestions or hints? In our installation quota is absolutely needed.

Thank you in advance for any help you may provide.

Cheers,
Giacinto

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Giacinto Donvito LIBI -- EGEE2 SA1 INFN - Bari ITALY
------------------------------------------------------------------
giacinto.donvito at ba.infn.it | GTalk/GMail: donvito.giacinto at gmail.com
tel. +39 080 5443244 Fax +39 0805442470 VOIP: +41225481596 | MSN: donvito.giacinto at hotmail.it
Skype: giacinto_it | AIM/iChat: gdonvito1 | Yahoo: eric1_it
------------------------------------------------------------------
"A simple design always takes less time to finish than a complex one.
So always do the simplest thing that could possibly work." Don Wells at www.extremeprogramming.org
"To see what is in front of one's nose needs a constant struggle." - George Orwell

On Fri, Jun 26, 2009 at 17:59, Giacinto Donvito <giacinto.donvito at ba.infn.it> wrote:

> Hi Nirmal,
>
> I've tried with both "2.6.18-128.1.6.el5_lustre.1.8.0.1smp x86_64" and
> "2.6.22.14 x86_64" clients.
>
> The result is the same.
>
> Cheers,
> Giacinto
> As you can see, the system is correctly accounting the used space, but
> the server does not deny writes:
>
> [root@lustre01 ~]# lfs quota -u donvito /lustre
> Disk quotas for user donvito (uid 501):
>      Filesystem     kbytes     quota     limit  grace  files  quota  limit  grace
>         /lustre  124927244*  12000000  15000000            24    100    100

(Hint) You can use "lfs quota -v ..." to see how much quota grant every quota slave has.

> The "messages" logs on the MDS and on both OSS servers are clean and
> nothing strange is noticeable.
> Do you have any idea where I can look in order to understand the
> problem?

This is expected with the current lquota design of Lustre. For Lustre quota there are two kinds of roles: the quota master (MDS) and the quota slaves (OSTs). When you set a quota, the limit is recorded on the quota master. When data is written on the OSTs, an OST asks the quota master for some quota grant if the quota remaining on that OST isn't enough. At the same time, the OSTs receive a kind of "quota grant cache", so that the quota slaves don't have to ask the master for quota on every write (if they did, performance would suffer). Each quota slave then decides, based only on the grant it has received from the MDS, whether the request it received would exceed the quota. That is why you see what you see.

> Do you have any other suggestions for tests that I can run?

For performance reasons, Lustre's lquota is distributed and is not exactly like a local quota (e.g. ext3). So what you see is normal for lquota; currently you can only adapt your application if this hurts you.
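[To make the hint above concrete, a minimal check along these lines shows where the grant ended up. The user name and mount point are the ones used in this thread; the exact output layout depends on the Lustre version.]

# overall view, as already shown earlier in the thread
lfs quota -u donvito /lustre

# verbose view: adds one line per quota slave (each MDT/OST),
# so you can see how much grant each OST is holding locally
lfs quota -v -u donvito /lustre

If one OST has exhausted its granted limit while another still holds unused grant, the first will deny writes and the second will not, which would explain the behaviour reported at the start of the thread.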
Thank you Zhiyong,

with this hint I was able to find a way to solve the problem.

Cheers,
Giacinto

-- --
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Giacinto Donvito LIBI -- EGEE3 SA1 INFN - Bari ITALY
------------------------------------------------------------------
giacinto.donvito at ba.infn.it | GTalk/GMail: donvito.giacinto at gmail.com
tel. +39 080 5443244 Fax +39 0805442470 | Skype: giacinto_it
VOIP: +41225481596 | MSN: donvito.giacinto at hotmail.it
AIM/iChat: gdonvito1 | Yahoo: eric1_it
------------------------------------------------------------------
Life is something that everyone should try at least once.
Henry J. Tillman

On 01 Jul 2009, at 12:24, Zhiyong Landen Tian wrote:

> (Hint) You can use "lfs quota -v ..." to see how much quota grant
> every quota slave has.
How did you solve this? We will be implementing quotas on our system soon and don't want to fall into the same trap.

Thanks,

Robert LeBlanc
Life Sciences & Undergraduate Education Computer Support
Brigham Young University

On Wed, Jul 1, 2009 at 5:53 AM, Giacinto Donvito <giacinto.donvito at ba.infn.it> wrote:

> Thank you Zhiyong,
>
> with this hint I was able to find a way to solve the problem.
>
> Cheers,
> Giacinto
Basically, when a new OSS server was added, I followed this procedure (I'm not sure that all the steps are needed, but this works with our installation):

quotainv
quotacheck
lfs setquota -u donvito --block-softlimit 0 --block-hardlimit 0 /lustre
lfs setquota -u donvito --block-softlimit 2000000 --block-hardlimit 1000000 /lustre

After this, all the OSTs show the quota correctly.

I hope this will be useful for you too.

Cheers,
Giacinto

-- --
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Giacinto Donvito LIBI -- EGEE3 SA1 INFN - Bari ITALY
------------------------------------------------------------------
giacinto.donvito at ba.infn.it | GTalk/GMail: donvito.giacinto at gmail.com
tel. +39 080 5443244 Fax +39 0805442470 | Skype: giacinto_it
VOIP: +41225481596 | MSN: donvito.giacinto at hotmail.it
AIM/iChat: gdonvito1 | Yahoo: eric1_it
------------------------------------------------------------------
"Not everything that can be counted counts, and not everything that counts can be counted." - Albert Einstein (1879-1955)

On 01 Jul 2009, at 16:59, Robert LeBlanc wrote:

> How did you solve this? We will be implementing quotas on our system
> soon and don't want to fall into the same trap.
>
> Thanks,
>
> Robert LeBlanc
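[Putting the fix together as a single sequence. This is only a sketch based on the procedure Giacinto describes above: the full "lfs quotainv" / "lfs quotacheck" invocations are assumed (the original message names the steps only as "quotainv" and "quotacheck"), and the user name and limits are the ones from this thread.]

# run on a node with the filesystem mounted at /lustre, after the new OSS/OSTs have joined

# 1. clear and rebuild the quota files so the new OSTs are included
#    (assumed invocation of the two steps named above)
lfs quotainv -ug /lustre
lfs quotacheck -ug /lustre

# 2. reset the user's block limits to zero, which seems to be what forces
#    the per-OST grants to be dropped and re-issued ...
lfs setquota -u donvito --block-softlimit 0 --block-hardlimit 0 /lustre

# 3. ... then set the desired limits again so fresh grant is handed out
#    across all OSTs, including the new ones (values in kbytes, as posted)
lfs setquota -u donvito --block-softlimit 2000000 --block-hardlimit 1000000 /lustre

# afterwards, "lfs quota -v -u donvito /lustre" can be used to confirm
# that every quota slave reports a sane limit, as suggested earlier in the thread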