Hi,

I am a novice with Lustre.
I have the following Lustre configuration:

OSS1
mount -t lustre /dev/sdc /LUSTRE_02
mount -t lustre 10.43.105.50@tcp0:/lustre02 /LUSTRE_02

where /dev/sdc is a SAN partition of 100GB

OSS2
mount -t lustre /dev/sdb /LUSTRE_02
mount -t lustre 10.43.105.50@tcp0:/lustre02 /LUSTRE_02

where /dev/sdb is a SAN partition of 70GB

for a total lustre02 filesystem size of 170GB.

I have one client:
CLIENT1
mount -t lustre 10.43.105.50@tcp0:/lustre02 /LUSTRE_02

At this point I start copying a large file to lustre02 from CLIENT1,
check which OSS node is being used for the copy, and shut that node down.
The operation in progress is suspended, and here I wonder:
how long will it hang?
I read the manual and expected that the client would, after some time,
try another OSS and complete the operation.
Obviously I did not understand it well.
The operation hangs for a long time, until I kill the process!
The section of the manual is: TIPS on page 8-23.

With this configuration, is it possible to add a third OSS that acts
as failover for both OSSs?

Thanks

Ing. Stefano Elmopi
Gruppo Darco - Resp. ICT Sistemi
Via Ostiense 131/L Corpo B, 00154 Roma
cell. 3466147165
tel. 0657060500
email:stefano.elmopi at sociale.it
On 2009-10-21, at 06:28, Stefano Elmopi wrote:
> I am a novice with Lustre.
> I have the following Lustre configuration:
> OSS1
> mount -t lustre /dev/sdc /LUSTRE_02
> mount -t lustre 10.43.105.50@tcp0:/lustre02 /LUSTRE_02
>
> where /dev/sdc is a SAN partition of 100GB
>
> OSS2
> mount -t lustre /dev/sdb /LUSTRE_02
> mount -t lustre 10.43.105.50@tcp0:/lustre02 /LUSTRE_02

You don't need to mount the client filesystem on top of the server
filesystem. This will lead to some confusion.

> where /dev/sdb is a SAN partition of 70GB
>
> for a total lustre02 filesystem size of 170GB.
>
> I have one client:
> CLIENT1
> mount -t lustre 10.43.105.50@tcp0:/lustre02 /LUSTRE_02
>
> At this point I start copying a large file to lustre02 from CLIENT1,
> check which OSS node is being used for the copy, and shut that node down.
> The operation in progress is suspended, and here I wonder:
> how long will it hang?
> I read the manual and expected that the client would, after some time,
> try another OSS and complete the operation.
> Obviously I did not understand it well.

You need to configure failover. Without some external HA software,
Lustre does not know that the two disk devices on the SAN are shared.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
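
As a rough illustration of the "configure failover" step mentioned above, the sketch below prints (but does not run) the tunefs.lustre commands that would record each OSS as the failover partner for the other's OST. Several things here are assumptions not stated in the thread: the two OSS NIDs are hypothetical, both OSS nodes are assumed to see both SAN devices under the same device names, and the helper function failover_cmd is made up for this example. It is only a starting point, not a complete failover setup.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# ASSUMPTIONS (not from the thread): the OSS NIDs below are hypothetical,
# and both OSS nodes can reach both SAN devices under the same names.
OSS1_NID="10.43.105.51@tcp0"   # hypothetical NID of OSS1
OSS2_NID="10.43.105.52@tcp0"   # hypothetical NID of OSS2

# Print the tunefs.lustre command that records a failover partner on an OST.
# $1 = OST block device, $2 = NID of the partner OSS allowed to serve it
failover_cmd() {
    printf 'tunefs.lustre --failnode=%s %s\n' "$2" "$1"
}

failover_cmd /dev/sdc "$OSS2_NID"   # OST normally served by OSS1
failover_cmd /dev/sdb "$OSS1_NID"   # OST normally served by OSS2
```

The printed commands would typically be run on the OSS that owns each device while the OST is unmounted. Recording the partner NID only tells clients where else to look; the external HA software still has to mount the OST on the partner node when the primary goes down.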