Dear all,

What is the correct sequence for shutting down an MDT/MGS server?

When I have to shut down the server, I first try to unmount the Lustre filesystem on all clients and then reboot the server. But the clients take a very long time to unmount the filesystem, more than 20 minutes. (I don't know whether this parameter is involved, but my /proc/sys/lustre/timeout is 100.)

Moreover, if I have to power off the server while the clients are still unmounting, then after the server powers up again I am unable to remount the Lustre filesystem on the clients without rebooting each one. If I try to enter the Lustre directory, I receive:

    lustre cannot change directory to /lustre_homes/: Cannot send after transport endpoint shutdown

If I try to mount it, the system tells me that the filesystem is already mounted. If I try to unmount it, the system tells me that the device is busy.

I think I am doing something wrong. I'm using Lustre 1.6.5 on Scientific Linux 4.7. Any hints?

Thanks in advance

-- 
ENRICO MORELLI
University of Florence
CERM - via Sacconi, 6 - 50019 Sesto Fiorentino (FI) - ITALY
email: morelli at CERM.UNIFI.IT
phone: +39 055 4574269
fax: +39 055 4574253
On Wed, 2009-06-10 at 15:25 +0200, Enrico Morelli wrote:
> which is the correct sequence for a MDT/MGS server shutdown?

Well, if you want to take an entire filesystem down, then technically and ideally (there is no strict requirement, this is just the optimum) you unmount in this order: clients, MDT, OSTs.

> Because, when I have to shutdown the server, first I try to umount the
> lustreFS on all client and after I reboot the server. But the client
> keeps a lot of time to umount the filesystem, more than 20 minutes.

If all you want to do is simply reboot a server, you can do that without unmounting the clients. When the server comes back, the clients will just resume where they left off. This is called recovery/failover. Lustre was designed to allow servers to reboot while clients are still using them.

> Moreover, if during the client umounting I have to poweroff the server,
> after the server power up, I'm unable to remount lustreFS on the
> clients without reboot each client, because if I try to enter in the
> lustreFS directory I receive
>
> lustre cannot change directory to /lustre_homes/: Cannot send
> after transport endpoint shutdown

Hrm. This might be related to rebooting the server during the client unmount, although that should be allowable as well. I know of no specific bugs in this area.

Also, I should ask: do you have your targets (OSTs, MDT) configured for failover? The above scenario, where you reboot a server and clients simply resume working when it comes back, requires that failover be configured for the targets.

b.
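The unmount order given in the reply above (clients, then MDT, then OSTs) can be sketched as a small planning helper. This is only an illustration under stated assumptions: the host names, the server mount points, and the use of `ssh` are hypothetical and not taken from the thread; only `/lustre_homes` appears in the original post. The function builds the command list and does not execute anything.

```python
from typing import List, Tuple

# Order stated in the reply: clients first, then the MDT, then the OSTs.
# The reply notes this is the ideal order, not a strict requirement.
SHUTDOWN_ORDER = {"client": 0, "mdt": 1, "ost": 2}

# (role, host, mount point); hosts and server mount points are hypothetical.
Mount = Tuple[str, str, str]


def plan_unmounts(mounts: List[Mount]) -> List[str]:
    """Return umount commands sorted into client -> MDT -> OST order.

    Nothing is executed; the caller decides how (and whether) to run them.
    """
    unknown = [m for m in mounts if m[0] not in SHUTDOWN_ORDER]
    if unknown:
        raise ValueError(f"unknown role(s): {unknown}")
    # sorted() is stable, so entries with the same role keep their input order.
    ordered = sorted(mounts, key=lambda m: SHUTDOWN_ORDER[m[0]])
    return [f"ssh {host} umount {mountpoint}" for _, host, mountpoint in ordered]


if __name__ == "__main__":
    example = [
        ("ost", "oss1", "/mnt/ost0"),          # hypothetical OSS host and path
        ("client", "client1", "/lustre_homes"),
        ("mdt", "mds1", "/mnt/mdt"),           # hypothetical MDS host and path
    ]
    for command in plan_unmounts(example):
        print(command)
```

Keeping the plan as a plain list of strings makes it easy to review before anything is run, which matters when a stuck client unmount can take many minutes.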
On Wed, 10 Jun 2009 12:52:22 -0400 "Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:
> Also, I should ask, do you have your targets (OSTs, MDT) configured
> for failover? The above scenario where you reboot a server and
> clients simply resume working when it comes back requires that
> failover be configured for the targets.
>
> b.

Thanks, that clears some things up for me.

As for failover, I don't know Lustre very well. How do I configure it?

-- 
Enrico Morelli
On Thu, 2009-06-11 at 10:09 +0200, Enrico Morelli wrote:
> Thanks, you cleared me some things.

You are welcome.

> For the failover, I don't know very well Lustre, how can do that?

It's all covered in the operations manual. If you are maintaining a Lustre system, I really do recommend reading the operations manual and keeping it handy for reference. It's at manual.lustre.org.

b.