Roger Spellman
2009-Jan-20 23:05 UTC
[Lustre-discuss] Recovery fails if clients not connected
I have 2 MDSes, configured as an active/standby pair. I have 5 OSTs that are NOT active/standby. I have 5 clients.

I am using Lustre 1.6.5, due to bug 18232 <https://bugzilla.lustre.org/show_bug.cgi?id=18232> which only affects 1.6.6. Using Lustre 1.6.5, when I reset my active node, the standby takes over. This is quite reliable.

Today, I did the following in this order:
  Unmounted all the clients
  Rebooted all the clients
  Stopped Linux HA from running
  Unmounted the OSTs
  Unmounted the MDS
  Rebooted the OSTs
  Rebooted both MDSes

When the MDSes started up, Linux HA chose one to be active. That system mounted the MDT.

I looked at the file /proc/fs/lustre/mds/tacc-MDT0000/recovery_status, and it showed:

  [root@ts-tacc-01 ~]# cat /proc/fs/lustre/mds/tacc-MDT0000/recovery_status
  status: RECOVERING
  recovery_start: 0
  time_remaining: 0
  connected_clients: 0/5
  completed_clients: 0/5
  replayed_requests: 0/??
  queued_requests: 0
  next_transno: 17768

***** Note that recovery_start and time_remaining are both zero. *****

I waited several minutes, and this file was the same.

I was waiting for recovery to complete before trying to mount the OSTs. However, it appears that this would never occur!

Does this look like a bug?

---------------------------

I format my MDT using the following command. The command is run from 10.2.43.1, and the failnode is 10.2.43.2:

  mkfs.lustre --reformat --fsname tacc --mdt --mgs --device-size=10000000 --mkfsoptions="-m 0 -O mmp" --failnode=10.2.43.2@o2ib0 /dev/sdb

I format the OSTs using the following command:

  /usr/bin/time -p mkfs.lustre --reformat --ost --mkfsoptions="-J device=/dev/sdc1 -m 0" --fsname tacc --device-size=400000000 --mgsnode=10.2.43.1@o2ib0 --mgsnode=10.2.43.2@o2ib0 /dev/sdb

I mount the clients using:

  mount -t lustre 10.2.43.1@o2ib:10.2.43.2@o2ib:/tacc /mnt/lustre
Klaus Steden
2009-Jan-21 01:13 UTC
[Lustre-discuss] Recovery fails if clients not connected
Hi Roger,

I believe you can connect the OSSs once the MDS has booted, and in fact, I'm pretty sure that the five in the "connected_clients: 0/5" are in fact your OSS nodes. Each OST maintains a connection to the MDS while the file system is mounted, so they will be included in the connection count on the MDS.

However, regardless of the state - if your MDS is online and the MDT is mounted, you can start up the OSS nodes and corresponding OSTs at any time; clients attempting to make transactions will have their I/O operations block (or fail, depending on the MDS config) until the missing nodes come back online.

hth,
Klaus
Andreas Dilger
2009-Jan-21 05:41 UTC
[Lustre-discuss] Recovery fails if clients not connected
On Jan 20, 2009 18:05 -0500, Roger Spellman wrote:
> I was waiting for recovery to complete before trying to mount the OSTs.
> However, it appears that this would never occur!
>
> Does this look like a bug?

No, this is intentional. It is to avoid the situation where the MDS is having network problems and a sysadmin might reboot the MDS to try and resolve the problem. The MDS will not begin recovery until at least one of the clients connects to the MDS.

If you want to abort recovery without the clients being present you can run "lctl --device ${mds_device} abort_recovery" on the MDS.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
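
A minimal sketch of how the state Andreas describes could be detected from a script, assuming recovery_status keeps the "name: value" layout shown in Roger's output. The function names recovery_field and recovery_not_started are made up for this illustration and are not part of Lustre:

```shell
#!/bin/sh
# Sketch only: the field names come from the recovery_status output quoted
# in this thread; the helper names are hypothetical.

# recovery_field FILE NAME: print the value of the "NAME: value" line in FILE.
recovery_field() {
    awk -v key="$2:" '$1 == key { print $2; exit }' "$1"
}

# recovery_not_started FILE: succeed if the target reports RECOVERING but the
# recovery timer has not started yet (recovery_start is still 0), i.e. the
# MDS is still waiting for a first client to connect.
recovery_not_started() {
    [ "$(recovery_field "$1" status)" = "RECOVERING" ] &&
        [ "$(recovery_field "$1" recovery_start)" = "0" ]
}
```

On the MDS this might be called as: recovery_not_started /proc/fs/lustre/mds/tacc-MDT0000/recovery_status && echo "recovery has not started; waiting for a client". Whether to then mount a client or abort recovery is left to the administrator.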
Roger Spellman
2009-Jan-21 17:13 UTC
[Lustre-discuss] Recovery fails if clients not connected
> I believe you can connect the OSSs once the MDS has booted, and in fact, I'm pretty
> sure that the five in the "connected_clients: 0/5" are in fact your OSS nodes. Each
> OST maintains a connection to the MDS while the file system is mounted, so they
> will be included in the connection count on the MDS.

Klaus, thanks for this information. But, the 0/5 is the number of clients. An MDT is a client of an OST, not the other way around.
Klaus Steden
2009-Jan-21 19:42 UTC
[Lustre-discuss] Recovery fails if clients not connected
Hrm, I think you're right, I'm getting fuzzy on some of the details. Andreas' posting explained it pretty clearly ... although I've apparently deleted it, somewhat foolishly. If you still have a copy of that message, would you mind forwarding it to me?

cheers,
Klaus
Andreas Dilger
2009-Jan-21 22:04 UTC
[Lustre-discuss] Recovery fails if clients not connected
On Jan 21, 2009 15:22 -0500, Roger Spellman wrote:
> If I say
>
>   lctl --device /dev/sdb abort_recovery
>
> I get an error. Is it supposed to be a device number? Where do I get
> that number?

Use "lctl dl" to get the Lustre device number, or you can specify it by the Lustre device name given by "lctl dl" as well.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
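
The lookup Andreas describes can be sketched as a small helper. This assumes that "lctl dl" prints one device per line with the device number in the first column and the device name in the fourth; that column layout and the helper name lustre_devno are assumptions for illustration, so check them against the output on your own MDS:

```shell
#!/bin/sh
# Sketch only: lustre_devno is a hypothetical helper, and the column layout
# of "lctl dl" (number first, name fourth) is an assumption.

# lustre_devno NAME: read "lctl dl"-style output on stdin and print the
# device number of the first entry whose name column equals NAME.
lustre_devno() {
    awk -v name="$1" '$4 == name { print $1; exit }'
}
```

With that in place, something like devno=$(lctl dl | lustre_devno tacc-MDT0000) followed by lctl --device "$devno" abort_recovery would run the command from earlier in the thread against the MDT. If the helper prints nothing, no device with that name was listed and abort_recovery should not be attempted.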