Sebastian Reitenbach
2010-Feb-01 12:32 UTC
[Lustre-discuss] OST: inactive device, but why?
Hi,

I am running Lustre in my test setup, and I saw an inactive OST device after doing what I describe below.

I run lustre-1.8.1.1 on SLES11 x86_64, on both servers and clients. On the MGS/MDS server (10.0.0.81), I created one MGS and two MDT partitions:

  mkfs.lustre --mgs --mdt --fsname=foo --reformat --mkfsoptions="-N 500000" /dev/xvdb1
  mkfs.lustre --mdt --fsname=bar --mgsnode=10.0.0.81@tcp --reformat /dev/xvdb2
  mount -t lustre /dev/xvdb1 /lustre/foo-mgs-mdt
  mount -t lustre /dev/xvdb2 /lustre/bar-mdt

On the two OSS hosts, I created the following storage devices.

On the OST1 host (10.0.0.85):

  mke2fs -O journal_dev -b 4096 /dev/xvdd
  mke2fs -O journal_dev -b 4096 /dev/xvde
  mkfs.lustre --fsname=foo --param="failover.node=10.0.0.86@tcp" --mgsnode=10.0.0.81@tcp --ost --reformat \
    --mkfsoptions="-j -J device=/dev/xvdd -E stride=32" /dev/xvdb1
  mkfs.lustre --fsname=bar --param="failover.node=10.0.0.86@tcp" --mgsnode=10.0.0.81@tcp --ost --reformat \
    --mkfsoptions="-j -J device=/dev/xvde -E stride=32" /dev/xvdb2

On the OST2 host (10.0.0.86):

  mke2fs -O journal_dev -b 4096 /dev/xvdf
  mke2fs -O journal_dev -b 4096 /dev/xvdg
  mkfs.lustre --fsname=foo --param="failover.node=10.0.0.85@tcp" --mgsnode=10.0.0.81@tcp --ost --reformat \
    --mkfsoptions="-j -J device=/dev/xvdf -E stride=32" /dev/xvdc1
  mkfs.lustre --fsname=bar --param="failover.node=10.0.0.85@tcp" --mgsnode=10.0.0.81@tcp --ost --reformat \
    --mkfsoptions="-j -J device=/dev/xvdg -E stride=32" /dev/xvdc2

The devices are shared storage via a SAN. I configured pacemaker, using the lustre resource script, to mount and unmount the partitions on the OST hosts (a simplified sketch of that configuration is shown further below). I had both OST hosts in standby mode. I put OST2 (10.0.0.86) into active mode, and all four partitions got successfully mounted on OST2. Then I put OST1 into active mode too. The partitions that belong to OST1 were automatically unmounted from OST2 and successfully mounted on OST1, due to the constraints configured in pacemaker.

Then I mounted the client filesystems:

  mount -t lustre 10.0.0.81@tcp:/foo /lustre/foo
  mount -t lustre 10.0.0.81@tcp:/bar /lustre/bar

LUSTRE-CLIENT:/lustre # lfs df -h
UUID                 bytes    Used     Available  Use%  Mounted on
foo-MDT0000_UUID     265.5M   19.1M    220.9M     7%    foo[MDT:0]
foo-OST0000_UUID          : inactive device
foo-OST0001_UUID     9.8G     22.7M    9.3G       0%    foo[OST:1]

filesystem summary:  9.8G     22.7M    9.3G       0%    foo

LUSTRE-CLIENT:/lustre # cat /proc/fs/lustre/osc/foo-OST0000-osc-ffff88003e695800/ost_conn_uuid
10.0.0.86@tcp
LUSTRE-CLIENT:/lustre # cat /proc/fs/lustre/osc/foo-OST0001-osc-ffff88003e695800/ost_conn_uuid
10.0.0.86@tcp

So foo-OST0000_UUID shows up as an inactive device. I unmounted the filesystem from the client, stopped the OSTs and the MDT/MGS, and remounted everything, starting with the MGS, then the MDT, the OSTs, and finally the client. However, I still had the inactive device.

Therefore I reformatted everything on the servers as above, then mounted the MGS/MDTs and then the OSTs. The only difference was how I started the OSTs: instead of taking the hosts from standby mode to online, both hosts were already online, and I just started the lustre resources one after the other. This time the lustre filesystems got mounted on the servers where they are intended to run.
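For reference, the relevant part of the pacemaker configuration looks roughly like the following crm shell snippet. This is only a simplified sketch: I write it here with ocf:heartbeat:Filesystem syntax as a stand-in for the lustre resource script, and the resource IDs, mount points, node name, and scores are placeholders, not my exact configuration.

  # one Filesystem resource per OST partition (external journals are not
  # part of the resource, they are referenced by the OST filesystem itself)
  primitive foo-ost0-fs ocf:heartbeat:Filesystem \
      params device="/dev/xvdb1" directory="/lustre/foo-ost0" fstype="lustre" \
      op monitor interval="120s" timeout="60s"
  primitive bar-ost0-fs ocf:heartbeat:Filesystem \
      params device="/dev/xvdb2" directory="/lustre/bar-ost0" fstype="lustre" \
      op monitor interval="120s" timeout="60s"
  # prefer the OST1 host for these two resources; the failover.node
  # parameter set at mkfs time allows them to run on the OST2 host as well
  location foo-ost0-on-ost1 foo-ost0-fs 100: ost1
  location bar-ost0-on-ost1 bar-ost0-fs 100: ost1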
After this second attempt, the client sees both OSTs:

LUSTRE-CLIENT:/lustre # lfs df -h
UUID                 bytes    Used     Available  Use%  Mounted on
foo-MDT0000_UUID     265.5M   19.1M    220.9M     7%    foo[MDT:0]
foo-OST0000_UUID     9.8G     22.7M    9.3G       0%    foo[OST:0]
foo-OST0001_UUID     9.8G     22.7M    9.3G       0%    foo[OST:1]

filesystem summary:  19.7G    45.4M    18.6G      0%    foo

LUSTRE-CLIENT:~ # cat /proc/fs/lustre/osc/foo-OST0000-osc-ffff88003e5ca000/ost_conn_uuid
10.0.0.85@tcp
LUSTRE-CLIENT:~ # cat /proc/fs/lustre/osc/foo-OST0001-osc-ffff88003e5ca000/ost_conn_uuid
10.0.0.86@tcp

Afterwards, I can also put one of the OST servers into standby mode and bring it back online without problems. For example, after putting OST1 into standby mode:

LUSTRE-CLIENT:/lustre/foo # lfs df -h
UUID                 bytes    Used     Available  Use%  Mounted on
foo-MDT0000_UUID     265.5M   19.1M    220.9M     7%    /lustre/foo[MDT:0]
foo-OST0000_UUID     9.8G     22.7M    9.3G       0%    /lustre/foo[OST:0]
foo-OST0001_UUID     9.8G     22.7M    9.3G       0%    /lustre/foo[OST:1]

filesystem summary:  19.7G    45.4M    18.6G      0%    /lustre/foo

LUSTRE-CLIENT:/lustre/foo # cat /proc/fs/lustre/osc/foo-OST0001-osc-ffff88003e5ca000/ost_conn_uuid
10.0.0.86@tcp
LUSTRE-CLIENT:/lustre/foo # cat /proc/fs/lustre/osc/foo-OST0000-osc-ffff88003e5ca000/ost_conn_uuid
10.0.0.86@tcp

and after taking it back online:

LUSTRE-CLIENT:/lustre/foo # lfs df -h
UUID                 bytes    Used     Available  Use%  Mounted on
foo-MDT0000_UUID     265.5M   19.1M    220.9M     7%    /lustre/foo[MDT:0]
foo-OST0001_UUID     9.8G     22.7M    9.3G       0%    /lustre/foo[OST:1]

filesystem summary:  9.8G     22.7M    9.3G       0%    /lustre/foo

LUSTRE-CLIENT:/lustre/foo # cat /proc/fs/lustre/osc/foo-OST0001-osc-ffff88003e5ca000/ost_conn_uuid
10.0.0.86@tcp
LUSTRE-CLIENT:/lustre/foo # cat /proc/fs/lustre/osc/foo-OST0000-osc-ffff88003e5ca000/ost_conn_uuid
10.0.0.85@tcp

This leaves me with some questions:

- Is this expected behaviour when the OST filesystems are mounted for the first time, but unfortunately on the wrong server?
- How can I make the inactive OST active again when this happens, without reformatting everything?
- Is there a way to prevent this from happening accidentally?

regards,
Sebastian
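PS: Regarding the second question, I assume something along these lines on the client would re-activate the OSC without a reformat, but I have not verified that this is the right approach (the device number is just a placeholder taken from the lctl dl output):

  # find the device number of the inactive OSC
  lctl dl | grep foo-OST0000
  # re-activate it, using the device number from the first column
  lctl --device <devno> activate
  # or the same via the tunable parameter
  lctl set_param osc.foo-OST0000-osc-*.active=1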