Ms. Megan Larko
2010-Aug-12 16:52 UTC
[Lustre-discuss] Understanding OST recovery_duration info
Hello,

I am looking at the status of my running Lustre 1.6.7.2_3 system (upgrade to 1.8.4 within weeks; impetus to further my education).

The default timeout value for Lustre is 100 sec. The default recovery time is 2x timeout value. So I believe our site should have a recovery of basically 200 sec. There are a total of 175 OSTs mounted on approximately 60 OSSes. Because of a hard power failure to the facility (the power went out AND the battery backup completely failed AND the generator was flakey) the linux 2.6.16.60-0.42.9 SLES10SP3 system was booted from a no-power state.

Lustre worked and the file system recovered just fine. For education, the value for "recovery_duration" in /proc/fs/lustre/obdfilter/{ost name}/recovery_status file is between 300 and 600. Does this mean that the actual recovery took between 300 and 600 seconds to successfully complete?

If yes, should the Lustre timeout default value be higher?

Is all of this moot under Lustre 1.8.4 and adaptive timeouts?

I appreciate the time taken to enlighten me. Smile!

Cheers!
M Larko
Andreas Dilger
2010-Aug-12 18:01 UTC
[Lustre-discuss] Understanding OST recovery_duration info
On 2010-08-12, at 10:52, Ms. Megan Larko wrote:
> I am looking at the status of my running Lustre 1.6.7.2_3 system
> (upgrade to 1.8.4 within weeks; impetus to further my education).
>
> The default timeout value for Lustre is 100 sec. The default
> recovery time is 2x timeout value. So I believe our site should have
> a recovery of basically 200 sec. There are a total of 175 OSTs
> mounted on approximately 60 OSSes. Because of a hard power failure to
> the facility (the power went out AND the battery backup completely
> failed AND the generator was flakey) the linux 2.6.16.60-0.42.9
> SLES10SP3 system was booted from a no-power state.
>
> Lustre worked and the file system recovered just fine. For education,
> the value for "recovery_duration" in /proc/fs/lustre/obdfilter/{ost
> name}/recovery_status file is between 300 and 600. Does this mean
> that the actual recovery took between 300 and 600 seconds to
> successfully complete?

Right. That is the actual total recovery time.

> If yes, should the Lustre timeout default value be higher?

No, because even though the base timeout is 100s, there are reasons to extend the recovery window (e.g. new clients continuing to connect will indicate to the OSS that there may still be more missing clients having trouble connecting for some reason).

> Is all of this moot under Lustre 1.8.4 and adaptive timeouts?

Not totally. There is no fixed recovery window in 1.8, but the basic concepts are largely the same. There will be an adaptive timeout during operation (between 5-900s by default) that will be used as the base number when recovery starts.

In my simple 3-client 1-server (MDT + 5 OST) home system, I had completed Lustre recovery after hard server reboot in 27s (not counting server restart time), so AT can definitely make a difference.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
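
For anyone who wants to compare the reported values across many OSTs, here is a minimal Python sketch that reads "recovery_duration" from each recovery_status file and prints how far it is from the nominal 2x timeout figure discussed above. It assumes the file consists of plain "key: value" lines, which is not confirmed in this thread, and the function and constant names are made up for illustration. It would have to be run on each OSS, since the /proc files are local to the server.

```python
import glob
import os

DEFAULT_TIMEOUT_S = 100                     # default Lustre timeout cited in the thread
NOMINAL_RECOVERY_S = 2 * DEFAULT_TIMEOUT_S  # "2x timeout value" from the thread


def parse_recovery_status(text):
    """Parse 'key: value' lines into a dict of strings.

    Assumption: recovery_status uses one 'key: value' pair per line.
    Lines without a colon are ignored.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def recovery_durations(pattern="/proc/fs/lustre/obdfilter/*/recovery_status"):
    """Return {ost_name: recovery_duration_seconds} for every matching file."""
    durations = {}
    for path in sorted(glob.glob(pattern)):
        # The OST name is the directory that contains recovery_status.
        ost = os.path.basename(os.path.dirname(path))
        with open(path) as fh:
            fields = parse_recovery_status(fh.read())
        value = fields.get("recovery_duration", "")
        if value.isdigit():  # skip OSTs that report no numeric duration
            durations[ost] = int(value)
    return durations


if __name__ == "__main__":
    for ost, seconds in recovery_durations().items():
        extra = seconds - NOMINAL_RECOVERY_S
        print(f"{ost}: {seconds}s ({extra:+d}s vs nominal {NOMINAL_RECOVERY_S}s)")
```

A positive difference is not a fault in itself: as explained above, the recovery window can be extended while clients are still reconnecting.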