Hello!

On Wed, May 31, 2006 at 10:50:58AM +0200, Lukasz Skital wrote:
> This behavior is not acceptable for my application. I would prefer a
> reasonably short timeout, after which the file operation would exit
> with an I/O error.
> Is it possible to configure Lustre that way?

When configuring your OSTs, specify them as failout (the default is failover).

Bye,
    Oleg

Hi all,

I am considering using Lustre in an unstable environment, where OSTs can be accidentally turned off. I have tested Lustre and found the following behavior: when one of the OSTs is unavailable, the filesystem is almost unusable, because operations on the unavailable files hang indefinitely. Of course, when the missing OST is back online, the operations resume.

This behavior is not acceptable for my application. I would prefer a reasonably short timeout, after which the file operation would exit with an I/O error. Is it possible to configure Lustre that way?

Best regards.

--
Łukasz Skitał <l.skital@cyfronet.pl>
GG: 1279114

Hello!

On Wed, May 31, 2006 at 11:15:59AM +0200, Lukasz Skital wrote:
> I tried the config with failout, but I do not see any difference between
> failover and failout. In both cases file operations hang without a
> timeout. I was looking through the documentation and found that it may be
> related to the lov, but the documentation seems to be somewhat incomplete.

Did you reformat, or at least run lconf --write_conf, every time you changed test_conf.xml?

Bye,
    Oleg

Thank you for the answer.

I tried the config with failout, but I do not see any difference between failover and failout. In both cases file operations hang without a timeout. I was looking through the documentation and found that it may be related to the lov, but the documentation seems to be somewhat incomplete.

My test config is as follows:

lmc -m test_conf.xml --add node --node hostname0 --timeout 5
lmc -m test_conf.xml --add net --node hostname0 --nid hostname0 --nettype tcp
lmc -m test_conf.xml --add node --node hostname1 --timeout 5
lmc -m test_conf.xml --add net --node hostname1 --nid hostname1 --nettype tcp

# Configure MDS
lmc -m test_conf.xml --format --add mds --node hostname0 --mds mds-test --fstype ext3 --dev /tmp/mds-test --size 1000000

# Configure OSTs
lmc -m test_conf.xml --add lov --lov lov-test --mds mds-test
lmc -m test_conf.xml --add ost --node hostname0 --failout --lov lov-test --ost ost1-test --fstype ext3 --dev /dev/sda4
lmc -m test_conf.xml --add ost --node hostname1 --failout --lov lov-test --ost ost2-test --fstype ext3 --dev /dev/sda4

# Configure client
lmc -m test_conf.xml --add mtpt --node hostname0 --path /mnt/lustre --mds mds-test --lov lov-test --clientoptions sync
lmc -m test_conf.xml --add mtpt --node hostname1 --path /mnt/lustre --mds mds-test --lov lov-test --clientoptions sync

Best regards,
Lukasz

On 5/31/06, Oleg Drokin <green@clusterfs.com> wrote:
> Hello!
>
> On Wed, May 31, 2006 at 10:50:58AM +0200, Lukasz Skital wrote:
> > This behavior is not acceptable for my application. I would prefer a
> > reasonably short timeout, after which the file operation would exit
> > with an I/O error.
> > Is it possible to configure Lustre that way?
>
> When configuring your OSTs, specify them as failout (the default is failover).
>
> Bye,
>     Oleg

--
Łukasz Skitał <l.skital@cyfronet.pl>
GG: 1279114

Hi,

I did reformat. My test routine is the following:

# find .
<lists all files>

Turn off one OST (by iptables rule)

# find .
<lists one file and hangs>

Best regards,
Lukasz

--
Łukasz Skitał <l.skital@cyfronet.pl>
GG: 1279114

Lukasz Skital wrote:
> Hi,
>
> I did reformat. My test routine is the following:
>
> # find .
> <lists all files>
>
> Turn off one OST (by iptables rule)

What do you mean by this? What happens if you turn off one OST with '/sbin/shutdown -h 0' followed by a power-off?

cliffw

> # find .
> <lists one file and hangs>
>
> Best regards,
> Lukasz
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss

Hi,

On 5/31/06, Cliff White <cliffw@clusterfs.com> wrote:
> Lukasz Skital wrote:
> > Hi,
> >
> > I did reformat. My test routine is the following:
> >
> > # find .
> > <lists all files>
> >
> > Turn off one OST (by iptables rule)
>
> What do you mean by this? What happens if you turn off one OST with
> '/sbin/shutdown -h 0' followed by a power-off?
> cliffw

I added an iptables rule on the OST which drops all traffic. This is to mimic a hardware shutdown.

Best regards,
Lukasz

--
Łukasz Skitał <l.skital@cyfronet.pl>
GG: 1279114

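For readers who want to reproduce this test, here is a minimal sketch of what "an iptables rule which drops all traffic" might look like. The exact rule used in the thread is not given, so the chains and rule placement below are assumptions. The helper names (run, ost_block, ost_unblock) and the DRY_RUN variable are hypothetical; with the default DRY_RUN=1 the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: simulate an abrupt OST outage by dropping all IP traffic
# on the OST node. The thread does not show the actual rule, so this is
# an assumption. DRY_RUN (hypothetical) defaults to 1: print, do not run.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Insert DROP rules at the top of the INPUT and OUTPUT chains.
ost_block() {
    run iptables -I INPUT -j DROP
    run iptables -I OUTPUT -j DROP
}

# Delete the same rules to bring the OST back online.
ost_unblock() {
    run iptables -D INPUT -j DROP
    run iptables -D OUTPUT -j DROP
}
```

Running with DRY_RUN=0 requires root and also cuts off SSH sessions to the node, so it is safer to do this from a console. Unlike a clean shutdown, dropped packets give the clients no connection reset, which is presumably why it resembles a power failure.
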
Hello!

On Wed, May 31, 2006 at 11:38:13AM +0200, Lukasz Skital wrote:
> I did reformat. My test routine is the following:
> # find .
> <lists all files>
> Turn off one OST (by iptables rule)
> # find .
> <lists one file and hangs>

How long did you wait? Did you see in the logs that the OST was marked invalid? You should receive EIO at that point. After that, everything should work OK, except that all access to the failed OST should return EIO.

Bye,
    Oleg

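To answer "how long did you wait?" in a repeatable way, one option is to wrap the test command in a wall-clock limit and record whether it finished, failed, or was still blocked. The sketch below assumes GNU coreutils timeout is available on the client. The probe function and its "ok"/"hung"/"error" labels are hypothetical, and nothing in this thread confirms that the signal sent by timeout can interrupt an operation blocked inside Lustre.

```shell
#!/bin/sh
# Sketch only: run a command under a time limit and classify the result.
# Assumes GNU coreutils `timeout`; probe and its labels are hypothetical.
#
# Usage: probe SECONDS COMMAND [ARGS...]
probe() {
    secs="$1"
    shift
    timeout "$secs" "$@" >/dev/null 2>&1
    rc=$?
    case "$rc" in
        0)   echo "ok" ;;     # command completed successfully
        124) echo "hung" ;;   # timeout(1) exits 124 when the limit expires
        *)   echo "error" ;;  # command failed, e.g. with an I/O error
    esac
}
```

For the test routine in this thread, a call such as `probe 60 find /mnt/lustre` would report "hung" if find is still blocked after 60 seconds, and "error" if it returned with a failure such as EIO once the OST has been marked invalid.
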