Robert Minvielle
2009-Jan-30 21:35 UTC
[Lustre-discuss] lustre no longer allows reads/writes (stopped working)?
I have set up a Lustre system for testing, consisting of four OSTs and one MDT. It seems to work fine for about a day. At the end of about 24 hours, the clients can no longer read or write the mount point (although a file listing (ls) works). For example, a mkdir yields "cannot create directory '/datafs/temp': Identifier removed", and the temp dir does not exist. A file listing of the /datafs directory comes back complete and correct, but if I try to ls a subdirectory it gives me the error "ls: /datafs/test2: Identifier removed".

The client is mounting the dir to /datafs. This worked fine earlier; I left for the day, came back in, and this error is occurring on all clients (albeit I only have three clients for testing). All clients/servers are running RHEL5, and Lustre was installed via RPMs as per the manual.

Out of curiosity, if I go to the server and do an ls on /mnt/data/mdt, or to the OST server and do an ls on /mnt/data/ost1, I get an error that it is not a directory (although that could be normal, I am not sure). A cat of /proc/fs/lustre/devices on the MDT does not show anything out of place (or at least, it is the same as when I started Lustre and mounted the servers/clients).

I have configured it all according to
http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustreExamples.html#50548848_pgfId-1286919
as per section 6.1.1.2 Configuration Generation and Application, using one server for the MGS and MDS, and I have four OSTs, just like the example.

Has anyone seen this before?

Robert
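For reference, the configuration in that manual section comes down to commands along the following lines. The device names, NID, and mount points here are placeholders to illustrate the layout, not necessarily what was actually run:

  # combined MGS/MDT node
  mkfs.lustre --fsname=datafs --mgs --mdt /dev/sda1
  mount -t lustre /dev/sda1 /mnt/data/mdt

  # on each of the four OST servers (ost1 ... ost4)
  mkfs.lustre --fsname=datafs --ost --mgsnode=mdsnode@tcp0 /dev/sdb1
  mount -t lustre /dev/sdb1 /mnt/data/ost1

  # on each client
  mount -t lustre mdsnode@tcp0:/datafs /datafs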
Oleg Drokin
2009-Jan-30 22:20 UTC
[Lustre-discuss] lustre no longer allows reads/writes (stopped working)?
Hello!

On Jan 30, 2009, at 4:35 PM, Robert Minvielle wrote:
> I have setup a lustre system for testing consisting of four OSTs and one
> MDT. It seems to work fine for about a day. At the end of about 24 hours,
> the clients can no longer read or write the mount point (although a file
> listing (ls) works). For example, a mkdir yields a "cannot create directory
> '/datafs/temp': Identifier removed", and the temp dir does not exist.
> A file listing of the /datafs directory comes back complete and correct,
> but if I try to ls a subdirectory it gives me the error "ls: /datafs/test2:
> Identifier removed".

That means your /etc/group file is out of sync between the clients and the MDS, and you have a group upcall configured. Making them identical is the simplest way to fix it, or refer to bug 14756:
https://bugzilla.lustre.org/show_bug.cgi?id=14756
for more details.

Bye,
Oleg
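A rough sketch of how to check for that condition, assuming a Lustre 1.6-style MDS and a hypothetical target name datafs-MDT0000 (the exact proc path and target name depend on the setup):

  # on the MDS: see whether a group upcall is configured
  cat /proc/fs/lustre/mds/datafs-MDT0000/group_upcall

  # compare the group file between the MDS and a client
  md5sum /etc/group
  ssh client1 md5sum /etc/group

  # one workaround, in addition to syncing the files: disable the upcall on the MDS
  echo NONE > /proc/fs/lustre/mds/datafs-MDT0000/group_upcall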
Jeremy Mann
2009-Jan-30 22:41 UTC
[Lustre-discuss] lustre no longer allows reads/writes (stopped working)?
Robert Minvielle wrote:
> I have setup a lustre system for testing consisting of four OSTs and one
> MDT. It seems to work fine for about a day. At the end of about 24 hours,
> the clients can no longer read or write the mount point (although a file
> listing (ls) works). For example, a mkdir yields a "cannot create directory
> '/datafs/temp': Identifier removed", and the temp dir does not exist.
> A file listing of the /datafs directory comes back complete and correct,
> but if I try to ls a subdirectory it gives me the error "ls: /datafs/test2:
> Identifier removed".

Robert, this happens when your MDT node does not have the same groups file as the OSTs.

--
Jeremy Mann
jeremy at biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672
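Following on from that, a minimal way to bring every node back in sync is to push one authoritative copy of the group file everywhere; the hostnames below are hypothetical:

  # run from the node whose /etc/group is authoritative
  for host in mds1 oss1 oss2 oss3 oss4 client1 client2 client3; do
      scp /etc/group root@$host:/etc/group
  done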
Robert Minvielle
2009-Jan-30 22:46 UTC
[Lustre-discuss] lustre no longer allows reads/writes (stopped working)?
Aha. I was searching for the wrong thing on bugzilla. I will correct and retest. Thank you.

----- Original Message -----
From: "Oleg Drokin" <Oleg.Drokin at Sun.COM>
To: "Robert Minvielle" <robert at lite3d.com>
Cc: lustre-discuss at lists.lustre.org
Sent: Friday, January 30, 2009 4:20:01 PM GMT -06:00 US/Canada Central
Subject: Re: [Lustre-discuss] lustre no longer allows reads/writes (stopped working)?

Hello!

On Jan 30, 2009, at 4:35 PM, Robert Minvielle wrote:
> I have setup a lustre system for testing consisting of four OSTs and one
> MDT. It seems to work fine for about a day. At the end of about 24 hours,
> the clients can no longer read or write the mount point (although a file
> listing (ls) works). For example, a mkdir yields a "cannot create directory
> '/datafs/temp': Identifier removed", and the temp dir does not exist.
> A file listing of the /datafs directory comes back complete and correct,
> but if I try to ls a subdirectory it gives me the error "ls: /datafs/test2:
> Identifier removed".

That means your /etc/group file is out of sync between the clients and the MDS, and you have a group upcall configured. Making them identical is the simplest way to fix it, or refer to bug 14756:
https://bugzilla.lustre.org/show_bug.cgi?id=14756
for more details.

Bye,
Oleg
Arden Wiebe
2009-Jan-30 23:15 UTC
[Lustre-discuss] lustre no longer allows reads/writes (stopped working)?
> I have setup a lustre system for testing consisting of four OSTs and one
> MDT. It seems to work fine for about a day. At the end of about 24 hours,
> the clients can no longer read or write the mount point (although a file
> listing (ls) works).

That is the problem. Your clients are mounting wrong. You have used incorrect formatting of the nodes.

> For example, a mkdir yields a "cannot create directory
> '/datafs/temp': Identifier removed", and the temp dir does not exist.
> A file listing of the /datafs directory comes back complete and correct,
> but if I try to ls a subdirectory it gives me the error "ls: /datafs/test2:
> Identifier removed".

Please review, via your bash history, the exact commands you used to make the underlying filesystem. Be certain everything is pointing to the correct filesystem and to the correct directories.

> The client is mounting the dir to /datafs. This worked fine earlier, I left
> for the day, came back in and this error is occurring on all clients (albeit
> I only have three clients for testing). All clients/servers are running
> RHEL5, and the lustre was installed via rpms as per the manual.

If you followed the manual 100% (it takes practice), the client should be mounting your combined MDT/MGS node at the MDT/MGS node's IP address via your network (for example, tcp0) on a local mountpoint such as /mnt/datafs. I found it helps to change the manual's example filesystem name to something other than datafs, testfs or spfs; in your case I would recommend the name litefs. There are also some ambiguities with slashes in the examples and, I might add, with use or misuse of the = sign after fsname. By far the best example is further into the manual, about mounting external journals. Also, from everything I have read, it is best to have the MGS and MDT separate. Otherwise you must have two mount points on your combined MDT/MGS node, /mnt/mgs and /mnt/data/mdt.

> Out of curiosity, if I go to the server and do an ls on /mnt/data/mdt or
> to the OST server and do an ls on /mnt/data/ost1, I get an error that
> it is not a directory (although that could be normal, I am not sure).

Yes, that is normal, because those are mount points, not directories.

> A cat of /proc/fs/lustre/devices on the mdt does not show anything out of place
> (or at least, it is the same as when I started the lustre and mounted
> the servers/clients)

So we assume your combined MDT/MGS is up and running, but is it formatted properly and mounted properly?

> I have configured it all according to
> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustreExamples.html#50548848_pgfId-1286919
> as per section 6.1.1.2 Configuration Generation and Application, using one server
> for the MGS and MDS, and I have four OSTs, just like the example.
>
> Has anyone seen this before?

Yes, and it is common until you get good enough at creating your Lustre filesystem to know which formatting and mounting procedures interact to make a live filesystem that you adopt and know to be sound.

Robert, to simplify things I'll include some of my .bash_history on the nodes for you to examine. This should considerably decrease your initial configuration timeframe. My configuration differs in that I opt for separate MGS and MDT. This first block is from the MGS.
umount /mnt/mgs
mdadm -S /dev/md2
mdadm -S /dev/md1
mdadm -S /dev/md0
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
mdadm --zero-superblock /dev/sdd
mdadm --zero-superblock /dev/sde
mdadm --zero-superblock /dev/sdf
mdadm -v --create --assume-clean /dev/md0 --level=raid10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
sfdisk -uC /dev/sdf << EOF
mke2fs -b 4096 -O journal_dev /dev/sdf1
cat /proc/mdstat
mkfs.lustre --mgs --fsname=ioio --mkfsoptions="-J device=/dev/sdf1" --reformat /dev/md0
rm /etc/mdadm.conf
mdadm --detail --scan --verbose > /etc/mdadm.conf
mount -t lustre /dev/md0 /mnt/mgs
e2label /dev/md0
vi /etc/fstab
e2label /dev/md0
cat /proc/mdstat
mount -t lustre 192.168.0.7@tcp0:/ioio /mnt/ioio
lctl dl
lfs df -h

This shows a single MGS with an external journal on /dev/sdf1. The MGS is mounted on /mnt/mgs from the /dev/md0 device. Its e2label will be MGS, which goes in /etc/fstab as LABEL=MGS followed by the mount options (example fstab entries are sketched at the end of this message). Here you can see I connect a client to the MGS to test the filesystem, but only after the MDT is mounted and the OSSs are mounted.

On the MDT:

umount /mnt/data/mdt
mdadm -S /dev/md2
mdadm -S /dev/md0
mdadm -S /dev/md1
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
mdadm --zero-superblock /dev/sdd
mdadm --zero-superblock /dev/sde
mdadm --zero-superblock /dev/sdf
mdadm -v --create --assume-clean /dev/md0 --level=raid10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
sfdisk -uC /dev/sdf << EOF
mke2fs -b 4096 -O journal_dev /dev/sdf1
cat /proc/mdstat
mkfs.lustre --mdt --fsname=ioio --mgsnode=192.168.0.7@tcp0 --mkfsoptions="-J device=/dev/sdf1" --reformat /dev/md0
mount -t lustre /dev/md0 /mnt/data/mdt
rm /etc/mdadm.conf
mdadm --detail --scan --verbose > /etc/mdadm.conf
e2label /dev/md0
vi /etc/fstab
shutdown -r now

When this MDT comes back online, your filesystem will be mounted correctly, as identified by lctl dl.

And a typical OST (choose whatever RAID level you require):

umount /mnt/data/ost0
cat /proc/mdstat
mdadm -S /dev/md0
mdadm --zero-superblock /dev/sdb
mdadm --zero-superblock /dev/sdc
mdadm --zero-superblock /dev/sdd
mdadm --zero-superblock /dev/sde
mdadm --zero-superblock /dev/sdf
mdadm --zero-superblock /dev/sdg
mdadm --zero-superblock /dev/sdh
mdadm -v --create --assume-clean /dev/md0 --level=raid10 --raid-devices=6 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
cat /proc/mdstat
sfdisk -uC /dev/sdb << EOF
mke2fs -b 4096 -O journal_dev /dev/sdb1
mkfs.lustre --ost --fsname=ioio --mgsnode=192.168.0.7@tcp0 --mkfsoptions="-J device=/dev/sdb1" --reformat /dev/md0
mount -t lustre /dev/md0 /mnt/data/ost0
rm /etc/mdadm.conf
mdadm --detail --scan --verbose > /etc/mdadm.conf
e2label /dev/md0
vi /etc/fstab
cat /proc/mdstat
shutdown -r now

When this box comes back up, the newly formatted OST should be mounted. If not, your e2label is incorrect, which does happen; the manual mentions that e2label won't report correctly until the device has been mounted for the first time.

Robert, I hope this helps speed up your testing deployment. It will probably take you two or three attempts to get a viable filesystem with all the variables in play and your naming conventions. Eventually you will end up wanting external journals as laid out above. Also, be sure to carry your directory naming conventions right through: for example, you mount the OST and its /dev/md0 device on /mnt/data/ost0; don't shorten the path, as I suspect you have on your OST mounts.
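The fstab entries referenced above would look roughly like this; the labels (which e2label reports once each target has been mounted) and the mount points are assumed from the ioio example, not copied from a real configuration:

  # on the MGS node
  LABEL=MGS                /mnt/mgs        lustre  defaults,_netdev  0 0
  # on the MDT node
  LABEL=ioio-MDT0000       /mnt/data/mdt   lustre  defaults,_netdev  0 0
  # on an OSS node
  LABEL=ioio-OST0000       /mnt/data/ost0  lustre  defaults,_netdev  0 0
  # on a client
  192.168.0.7@tcp0:/ioio   /mnt/ioio       lustre  defaults,_netdev  0 0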