Edward Walter
2010-Aug-09 13:44 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
Hello List,

We recently experienced a power failure (and subsequent UPS failure) which caused our Lustre filesystem to shut down hard. We were able to bring it back online but started seeing errors where the OSTs were being remounted read-only. We observed that all of the read-only OSTs were reporting an I/O error on the same block (the MMP block) and generating the following messages:

> Lustre: Server data-OST0004 on device /dev/sdd has started
> end_request: I/O error, dev sdd, sector 861112
> Buffer I/O error on device sdd, logical block 107639
> lost page write due to I/O error on sdd
> LDISKFS-fs error (device sdd): kmmpd: Error writing to MMP block
> end_request: I/O error, dev sdd, sector 0
> Buffer I/O error on device sdd, logical block 0
> lost page write due to I/O error on sdd
> LDISKFS-fs warning (device sdd): kmmpd: kmmpd being stopped since
> filesystem has been remounted as readonly.
> end_request: I/O error, dev sdd, sector 861112
> Buffer I/O error on device sdd, logical block 107639
> lost page write due to I/O error on sdd

We do have our OSTs set up for failover, but we manage access through the shared RAID array itself (using LUN fencing), so we don't need the MMP feature. We disabled MMP using tune2fs (tune2fs -O ^mmp /dev/sdd) on one set of OSTs. When we tried to mount these OSTs we received a message that the volume could not be mounted because MMP was not enabled.

We subsequently re-enabled MMP (tune2fs -O mmp /dev/sdd). Oddly, this did not return a message indicating the MMP interval or block number. Running 'tune2fs -l' indicates that MMP is enabled on the volume though. We also observed that the OST volumes we disabled MMP on are now indicating that MMP is enabled even though we did not re-enable it.

At this point we can mount the OST targets using ldiskfs in read-only mode. When we attempt to mount them as part of a Lustre volume we get the following error:

Aug 9 09:25:53 oss-0-25 kernel: LDISKFS-fs warning (device sdd): ldiskfs_multi_mount_protect: fsck is running on the filesystem
Aug 9 09:25:53 oss-0-25 kernel: LDISKFS-fs warning (device sdd): ldiskfs_multi_mount_protect: MMP failure info: last update time: 1280954496, last update node: oss-0-25, last update device: /dev/sdd

We're not sure how to proceed at this point. It seems like all of the filesystem objects are present (df reports correct numbers). Has anyone seen this before and worked their way through getting things back online?

Note:
Lustre version = 1.6.6 (using Sun's RPMs)
OS = CentOS 5.2
Kernel = 2.6.18-92.1.10.el5_lustre.1.6.6smp

Thanks much.

-Ed Walter
Carnegie Mellon University
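For reference, the MMP metadata that kmmpd is complaining about can be read straight off the superblock. The exact field names are an assumption about what this Sun e2fsprogs build prints; newer builds show an "MMP block number" and "MMP update interval" line when the feature is enabled:

  dumpe2fs -h /dev/sdd | grep -i mmp
  tune2fs -l /dev/sdd | grep -i mmp

If those fields are reported, the MMP block number should line up with the logical block (107639) in the I/O errors above.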
Ken Hornstein
2010-Aug-09 13:53 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
> We recently experienced a power failure (and subsequent UPS failure)
> which caused our Lustre filesystem to shut down hard. We were able to
> bring it back online but started seeing errors where the OSTs were being
> remounted as read-only. We observed that all of the read-only OSTs were
> reporting an I/O error on the same block (the MMP block) and generating
> the following message:
> [...]

I had a similar issue once, but the issue was that the MMP block was corrupted. What finally fixed it was running tune2fs -E clear-mmp. Maybe that might solve the problem?

--Ken
Edward Walter
2010-Aug-09 14:57 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
Hi Ken,

Thanks for the tip. This gives me an MMP error though:

[root@oss-0-25 log]# tune2fs -E clear-mmp /dev/sdd
tune2fs 1.40.11.sun1 (17-June-2008)
tune2fs: MMP: appears fsck currently being run on the filesystem while trying to open /dev/sdd
Couldn't find valid filesystem superblock.

At the risk of being obvious: we're not running any kind of fsck operation on this volume. Also, I can still mount this volume read-only using ldiskfs as the filesystem type, so I'm suspicious of the filesystem superblock message.

Thanks.

-Ed

Ken Hornstein wrote:
>> We recently experienced a power failure (and subsequent UPS failure)
>> [...]
>
> I had a similar issue once, but the issue was that the MMP block was
> corrupted. What finally fixed it was running tune2fs -E clear-mmp.
> Maybe that might solve the problem?
>
> --Ken
Ken Hornstein
2010-Aug-09 15:03 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
> This gives me an MMP error though:
> [root@oss-0-25 log]# tune2fs -E clear-mmp /dev/sdd
> tune2fs 1.40.11.sun1 (17-June-2008)
> tune2fs: MMP: appears fsck currently being run on the filesystem while
> trying to open /dev/sdd
> Couldn't find valid filesystem superblock.

Oh, I forgot ... did you try adding the -f flag? E.g.:

# tune2fs -f -E clear-mmp /dev/sdd

According to the tune2fs man page, when you use clear-mmp you also need the -f flag. Still being able to mount the filesystem read-only makes sense to me, since that wouldn't affect fsck being run.

--Ken
Edward Walter
2010-Aug-09 15:12 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
Using 'tune2fs -f -E clear-mmp' causes tune2fs to segfault:

tune2fs 1.40.11.sun1 (17-June-2008)

Bad options specified.

Extended options are separated by commas, and may take an argument which
is set off by an equals ('=') sign.

Valid extended options are:
        stride=<RAID per-disk chunk size in blocks>
        stripe-width=<RAID stride*data disks in blocks>
        test_fs
        ^test_fs
Segmentation fault

Did you use a newer version of tune2fs/e2fsprogs? Our current version is e2fsprogs-1.40.11.sun1-0redhat. Do you know if it's safe to rev up versions of e2fsprogs while running an older Lustre kernel revision (1.6.6)?

Thanks again.

-Ed

Ken Hornstein wrote:
>> This gives me an MMP error though:
>> [...]
>
> Oh, I forgot ... did you try adding the -f flag? E.g.:
>
> # tune2fs -f -E clear-mmp /dev/sdd
>
> According to the tune2fs man page, when you use clear-mmp you also need
> the -f flag. Still being able to mount the filesystem read-only would
> make sense to me, since that wouldn't affect fsck being run.
>
> --Ken
Andreas Dilger
2010-Aug-09 15:15 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
On 2010-08-09, at 11:12, Edward Walter wrote:
> Using 'tune2fs -f -E clear-mmp' causes tune2fs to segfault:
> [...]
>
> Did you use a newer version of tune2fs/e2fsprogs? Our current version is
> e2fsprogs-1.40.11.sun1-0redhat. Do you know if it's safe to rev up versions
> of e2fsprogs while running an older Lustre kernel revision (1.6.6)?

Running newer e2fsprogs is OK, and in fact a lot of issues w.r.t. MMP were fixed in newer releases.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
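A minimal sketch of the upgrade-and-retry sequence, assuming the OST is unmounted and that a newer Lustre-patched e2fsprogs RPM has already been downloaded for this distribution (the exact package name and version below are illustrative, not a specific recommendation):

  rpm -Uvh e2fsprogs-1.41.x.sun1-0redhat.x86_64.rpm   # newer build containing the MMP fixes
  tune2fs -f -E clear-mmp /dev/sdd                    # clear the stale MMP block left by the crash
  tune2fs -l /dev/sdd | grep -i mmp                   # confirm the MMP feature/state afterwards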
Ken Hornstein
2010-Aug-09 15:22 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
> Using 'tune2fs -f -E clear-mmp' causes tune2fs to segfault:

Ewww ... well, not sure what to tell you about that.

> Did you use a newer version of tune2fs/e2fsprogs? Our current version
> is e2fsprogs-1.40.11.sun1-0redhat. Do you know if it's safe to rev up
> versions of e2fsprogs while running an older Lustre kernel revision (1.6.6)?

I am using e2fsprogs-1.41.6.sun1-0suse ... and I know that is old.

I was going to say that I don't know if revving up e2fsprogs is okay, but I see that Andreas already answered that one. I can't be 100% sure that upgrading e2fsprogs _will_ solve your problem, but I think it's worth a shot.

--Ken
laotsao 老曹
2010-Aug-09 15:49 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
hi

I did go through the various Lustre downloads. It seems that 1.6.7, 1.8, and 1.8.0.1 all ship e2fsprogs-1.40.11-sun1, while 1.8.1.1 has 1.41.6.sun1. Hopefully that version is good for your CentOS 5.2 and kernel version.

regards

On 8/9/2010 11:22 AM, Ken Hornstein wrote:
>> Using 'tune2fs -f -E clear-mmp' causes tune2fs to segfault:
> Ewww ... well, not sure what to tell you about that.
>
> [...]
>
> I was going to say that I don't know if revving up e2fsprogs is okay, but
> I see that Andreas already answered that one. I can't be 100% sure that
> upgrading e2fsprogs _will_ solve your problem, but I think it's worth
> a shot.
>
> --Ken
laotsao 老曹
2010-Aug-09 16:47 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
http://downloads.lustre.org/public/tools/e2fsprogs/

On 8/9/2010 11:49 AM, laotsao 老曹 wrote:
> hi
> I did go through the various Lustre downloads. It seems that 1.6.7, 1.8,
> and 1.8.0.1 all ship e2fsprogs-1.40.11-sun1, while 1.8.1.1 has 1.41.6.sun1.
> Hopefully that version is good for your CentOS 5.2 and kernel version.
>
> regards
>
> [...]
Edward Walter
2010-Aug-09 18:11 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
Ok, we're making progress. We updated our e2fsprogs to e2fsprogs-1.41.10.sun2-0redhat. That let us clear the MMP blocks and mount our OSTs as part of our Lustre volume. :)

We're continuing to test things and seeing weird behavior when we run an ost-survey though. It looks as though the Lustre client is getting shuffled back and forth between OSS server pairs for our OSTs. The client times out connecting to the primary server, attempts to connect to the failover server (and fails because the OST is on the primary), and then reconnects to the primary server and finishes the survey. This behavior is not isolated to one particular OST (or client) and doesn't occur with every survey.

### Here's an example of the error we see on the client when this occurs:

[root@compute-2-7 ~]# lfs check servers
data-MDT0000-mdc-ffff81041f9b2c00 active.
data-OST0000-osc-ffff81041f9b2c00 active.
data-OST0001-osc-ffff81041f9b2c00 active.
data-OST0002-osc-ffff81041f9b2c00 active.
data-OST0003-osc-ffff81041f9b2c00 active.
data-OST0004-osc-ffff81041f9b2c00 active.
data-OST0005-osc-ffff81041f9b2c00 active.
data-OST0006-osc-ffff81041f9b2c00 active.
data-OST0007-osc-ffff81041f9b2c00 active.
data-OST0008-osc-ffff81041f9b2c00 active.
data-OST0009-osc-ffff81041f9b2c00 active.
data-OST000a-osc-ffff81041f9b2c00 active.
error: check 'data-OST000b-osc-ffff81041f9b2c00': Resource temporarily unavailable (11)

### and here's the relevant dmesg info:

[root@compute-2-7 ~]# dmesg |grep Lustre
Lustre: Client data-client has started
Lustre: Request x121943 sent from data-OST000b-osc-ffff81041f9b2c00 to NID 172.16.1.25@o2ib 100s ago has timed out (limit 100s).
Lustre: Skipped 1 previous similar message
Lustre: data-OST000b-osc-ffff81041f9b2c00: Connection to service data-OST000b via nid 172.16.1.25@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: Skipped 3 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.16.1.25@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 11 previous similar messages
Lustre: Changing connection for data-OST000b-osc-ffff81041f9b2c00 to 172.16.1.23@o2ib/172.16.1.23@o2ib
Lustre: Skipped 11 previous similar messages
Lustre: 4264:0:(import.c:410:import_select_connection()) data-OST000b-osc-ffff81041f9b2c00: tried all connections, increasing latency to 6s
Lustre: 4264:0:(import.c:410:import_select_connection()) Skipped 4 previous similar messages
LustreError: 11-0: an error occurred while communicating with 172.16.1.25@o2ib. The ost_connect operation failed with -16
LustreError: Skipped 1 previous similar message
Lustre: Changing connection for data-OST000b-osc-ffff81041f9b2c00 to 172.16.1.23@o2ib/172.16.1.23@o2ib
Lustre: Skipped 1 previous similar message
Lustre: 4264:0:(import.c:410:import_select_connection()) data-OST000b-osc-ffff81041f9b2c00: tried all connections, increasing latency to 11s
Lustre: data-OST000b-osc-ffff81041f9b2c00: Connection restored to service data-OST000b using nid 172.16.1.25@o2ib.
Lustre: Skipped 1 previous similar message

### the ost-survey completes but it's obvious that something's not right:

[root@compute-2-7 ~]# ost-survey -s 50 /lustre/
/usr/bin/ost-survey: 08/09/10 OST speed survey on /lustre/ from 172.16.255.223@o2ib
Number of Active OST devices : 12
Worst  Read OST indx: 11 speed: 2.449542
Best   Read OST indx: 3 speed: 2.512130
Read Average: 2.480302 +/- 0.018453 MB/s
Worst  Write OST indx: 11 speed: 0.209190
Best   Write OST indx: 4 speed: 5.595996
Write Average: 4.223409 +/- 2.038925 MB/s
Ost#  Read(MB/s)  Write(MB/s)  Read-time  Write-time
----------------------------------------------------
0     2.481       5.527        20.152     9.046
1     2.464       5.484        20.294     9.118
2     2.492       5.559        20.067     8.994
3     2.512       4.413        19.903     11.330
4     2.476       5.596        20.190     8.935
5     2.485       5.444        20.117     9.184
6     2.499       5.525        20.005     9.050
7     2.468       1.387        20.260     36.047
8     2.494       5.468        20.047     9.144
9     2.491       5.398        20.071     9.263
10    2.451       0.671        20.400     74.568
11    2.450       0.209        20.412     239.017

### Sorry for the wall of text here and thanks for the help everyone.

-Ed

laotsao 老曹 wrote:
> http://downloads.lustre.org/public/tools/e2fsprogs/
>
> [...]
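A suggestion before tuning timeouts: since the client keeps being redirected to 172.16.1.23@o2ib, it may be worth confirming that the failover NIDs recorded on the flapping target are what you expect, and which NIDs each OSS is actually serving. A sketch only; the device path below is illustrative, and tunefs.lustre should be run on the OSS that owns data-OST000b with the target unmounted:

  tunefs.lustre --print /dev/sdX | grep -i failover   # failover NIDs written into the target's config, if any
  lctl list_nids                                      # NIDs this OSS is currently serving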
Andreas Dilger
2010-Aug-09 20:13 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
On 2010-08-09, at 14:11, Edward Walter wrote:
> We're continuing to test things and seeing weird behavior when we run an
> ost-survey though. It looks as though the Lustre client is getting
> shuffled back and forth between OSS server pairs for our OSTs. The
> client times out connecting to the primary server, attempts to connect to
> the failover server (and fails because the OST is on the primary) and then
> reconnects to the primary server and finishes the survey. This behavior is
> not isolated to one particular OST (or client) and doesn't occur with every survey.
>
> and here's the relevant dmesg info:
>
> [root@compute-2-7 ~]# dmesg |grep Lustre
> Lustre: Client data-client has started
> Lustre: Request x121943 sent from data-OST000b-osc-ffff81041f9b2c00 to
> NID 172.16.1.25@o2ib 100s ago has timed out (limit 100s).
> Lustre: Skipped 1 previous similar message

If you have a larger cluster (hundreds of clients) with 1.6.6 you have to increase the Lustre timeout value beyond 100s for the worst-case I/O (300s is pretty typical at 1000 clients), but this is too long for most cases.

What you really want is to upgrade to 1.8.x in order to get adaptive timeouts. This allows the clients/servers to handle varying network and storage latency, instead of having a fixed timeout.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
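For completeness, a sketch of how the fixed timeout is usually raised on 1.6.x. The 300s value and the filesystem name "data" are taken from the discussion and logs above, and the conf_param form assumes the 1.6.6 tools accept sys.timeout:

  echo 300 > /proc/sys/lustre/timeout     # temporary, repeat on every client and server
  lctl conf_param data.sys.timeout=300    # persistent, run once on the MGS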
Edward Walter
2010-Aug-09 20:32 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
Andreas Dilger wrote:
> On 2010-08-09, at 14:11, Edward Walter wrote:
>> [...]
>
> If you have a larger cluster (hundreds of clients) with 1.6.6 you have to
> increase the Lustre timeout value beyond 100s for the worst-case I/O (300s
> is pretty typical at 1000 clients), but this is too long for most cases.
>
> What you really want is to upgrade to 1.8.x in order to get adaptive
> timeouts. This allows the clients/servers to handle varying network and
> storage latency, instead of having a fixed timeout.
>
> Cheers, Andreas

Hi Andreas,

Our cluster is fairly modest in size (104 clients, 4 OSS, 12 OSTs, 1 active MDS). We have plans for upgrading to 1.8.x but those plans now include stabilizing our 1.6.6 installation so that we can do a full backup before upgrading.

For now we're doing our testing from 2-3 nodes without any of the other nodes mounting Lustre. This configuration was stable and reliable until the hard shutdown. Obviously we'd like to get back to where we were before upgrading. Our timeout on the clients (cat /proc/sys/lustre/timeout) is 100s. Shouldn't this be sufficient for 2 clients? I think something else is going on.

-Ed
laotsao 老曹
2010-Aug-10 12:05 UTC
[Lustre-discuss] OST targets not mountable after disabling/enabling MMP
hi

Could the timeouts be due to your IB network? It seems there is no harm in just increasing the timeout from 100s to 200s to see whether ost-survey finishes without any error. After the power outage, did you check that all FC paths and IB paths are good?

my 2c

On 8/9/2010 4:32 PM, Edward Walter wrote:
> Our cluster is fairly modest in size (104 clients, 4 OSS, 12 OSTs, 1
> active MDS). We have plans for upgrading to 1.8.x but those plans now
> include stabilizing our 1.6.6 installation so that we can do a full
> backup before upgrading.
>
> For now we're doing our testing from 2-3 nodes without any of the other
> nodes mounting Lustre. This configuration was stable and reliable until
> the hard shutdown. Obviously we'd like to get back to where we were
> before upgrading. Our timeout on the clients (cat
> /proc/sys/lustre/timeout) is 100s. Shouldn't this be sufficient for 2
> clients? I think something else is going on.
>
> -Ed
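A sketch of the kind of path checks meant here, assuming OFED's infiniband-diags and device-mapper-multipath are what is installed on the OSS nodes (swap in your RAID vendor's own path tools if not):

  ibstat            # local HCA port state and rate; ports should show Active/LinkUp
  ibcheckerrors     # sweep the IB fabric for ports with high error counters
  multipath -ll     # confirm every FC LUN still shows all of its expected active paths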