Hello All:

My research group recently invested in a 17-node rack-mounted Linux cluster. It was delivered recently and, being the lowly graduate student that I am, I was told to 'make sure that it works.' After investigating it, I noticed a problem with the hard drives on two of the nodes, 14 and 16. On most of the nodes, 'df -hT' gives the following:

Filesystem     Type    Size  Used Avail Use% Mounted on
/dev/hda3      ext3     35G  1.7G   31G   6% /
/dev/hda1      ext3     99M   14M   79M  15% /boot
none           tmpfs   250M     0  250M   0% /dev/shm
master:/home   nfs     660G   33M  626G   1% /home

which is as it should be. Note that the /home directory on all the nodes is NFS-mounted from our RAID array. However, on the faulty nodes, the first line of 'df -hT' is instead:

/dev/hda3      ext3    3.6G  1.9G  1.6G  53% /

It's pretty clear that a typo was made when these two drives were partitioned, leaving about 80% of the hard drive unallocated, and the / partition a tenth of the size it should be. The company that sold us the cluster has been less supportive than I would have liked.

My question, then, is: What is the simplest way to go about getting the partition up to its correct size? It seems that I have three possible solutions:

1) expand the partition,
2) copy over the faulty drive with one of the correct node HDs, or
3) reinstall the OS on those drives.

I would really like to avoid reinstalling the OS, since that would mean a lot of extra work to get the node reconfigured to operate in the cluster again (I think). I also considered just allocating the extra space as a new partition unique to those two nodes (perhaps mounted on /usr, since space on /home is not an issue), but I am concerned that a 3.6 GB / partition may not be sufficient. But perhaps this isn't such a cause for concern.

My understanding of programs like 'parted' is that they probably won't be able to help with expanding the / partition, either because it is the / partition in question or because it is an ext3 partition (is this correct?).
So I have been looking for an easy way to copy one hard drive to another, along with the changes I would have to make manually so that the drive realizes it is node14 and not a second node1.

So, does anyone have a supremely elegant solution for this problem? Or maybe just a solution?

Thanks,

Ryan Roth
Osgood Group / ISE
Columbia University
rothr@cumsl.msl.columbia.edu
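The 'df' comparison described in the message above could be scripted so that undersized nodes stand out. The following is only a minimal sketch: the helper names root_kb and root_is_small are hypothetical, and the 10 GB threshold is an arbitrary value chosen for illustration. It reads 'df -P -k' output (the POSIX format, one line per filesystem, sizes in 1 KB blocks) instead of the human-readable 'df -hT' form.

```shell
#!/bin/sh
# root_kb (hypothetical helper): read `df -P -k` output on stdin and print
# the size, in 1 KB blocks, of the filesystem mounted on /.
root_kb() {
    awk 'NR > 1 && $NF == "/" { print $2 }'
}

# root_is_small MIN_KB (hypothetical helper): succeed if the root filesystem
# described on stdin is smaller than MIN_KB.
root_is_small() {
    kb=$(root_kb)
    [ -n "$kb" ] && [ "$kb" -lt "$1" ]
}

# Example on the local machine, with 10 GB as an arbitrary threshold:
#   df -P -k / | root_is_small 10485760 && echo "root filesystem looks undersized"
```

The same pipeline could be fed from a remote shell on each node, which would depend on how remote access is set up on the cluster.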
On Sat, Apr 26, 2003 at 05:15:08PM -0400, rothr@cumsl.msl.columbia.edu wrote:
> However, on the faulty nodes, the first line of 'df -hT' is instead:
>
> /dev/hda3 ext3 3.6G 1.9G 1.6G 53% /
>
> It's pretty clear that a typo was made when these two drives were
> partitioned, leaving about 80% of the hard drive unallocated, and the /
> partition a tenth of the size it should be. The company that sold us the
> cluster has been less supportive than I would have liked.
>
> My question, then, is: What is the simplest way to go about getting the
> partition up to its correct size?

1) Boot from a rescue CD-ROM that has fdisk and the resize2fs program on it.

2) Use "fdisk /dev/hda" and make sure the hda3 partition is correctly sized. Make sure nothing else from /dev/hda is mounted when you run fdisk, so that the kernel will re-read the partition table. Otherwise, you may need to reboot the system after you run fdisk.

3) Use the command "resize2fs /dev/hda3" to expand the filesystem on /dev/hda3 to the full size of the hda3 partition.

- Ted
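The three steps above might look roughly like the following when run from a rescue environment. This is a sketch under assumptions, not a tested procedure: the run wrapper, the grow_root function, and the DRY_RUN, DISK, and PART variables are hypothetical names introduced here, and the script only prints the commands unless DRY_RUN=0 is set. The e2fsck step is an addition not mentioned in the reply; resize2fs normally asks for a forced check before it will resize an unmounted filesystem.

```shell
#!/bin/sh
# Sketch of the rescue-environment steps. DISK and PART default to the
# devices named in the thread; override them if the drive is attached elsewhere.
DISK=${DISK:-/dev/hda}
PART=${PART:-/dev/hda3}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands (the default)

# run (hypothetical helper): print the command, or execute it when DRY_RUN=0.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

grow_root() {
    run fdisk -l "$DISK"   # show the current partition table first
    # Interactive step: inside fdisk, resize the partition. A common way
    # (an assumption here) is to delete it and recreate it with the SAME
    # starting cylinder and a larger end, then write the table.
    run fdisk "$DISK"
    run e2fsck -f "$PART"  # forced check before resizing
    run resize2fs "$PART"  # no size given: grow to fill the partition
}

grow_root
```

Nothing on the disk should be mounted while this runs, and a backup is worth having before the partition table is touched.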
Hello:

Thanks for your help, Ted. Unfortunately, I ran into some more problems.

I wasn't able to follow your advice exactly; only one of the nodes in our cluster has a CD-ROM drive, and I wasn't given a rescue CD by the company. Instead, what I did was to first back up the affected drive to our RAID array. I did this manually with 'cp -dpR'.

I then removed the affected drive and connected it to a different node as a slave drive (/dev/hdb). From there, I was able to alter the partition. I ended up using parted, because I was hoping that its 'resize' command would keep the files on the drive intact. I probably made a mistake while doing so: a little investigation showed that the files were inaccessible after I changed the partition size.

Resigned to the loss of the files and thankful to have a backup, I used mkfs to reformat the partition. I then restored the original files to the partition, again using 'cp -dpR'.

When I returned the drive to its original node, the kernel seemed to boot up well enough at first. However, it eventually reaches 'Checking root filesystem' and stops, complaining that it cannot locate a filesystem with label = '/'. It then gives me the option of running a shell in rescue mode before rebooting.

I believe I have missed a detail in the way the filesystem has to be set up for a partition mounted as /. I would appreciate any insight into what is causing this problem and how it might be mended.

Thank you for helping the novice,

Ryan Roth
Osgood Group / ISE
Columbia University
rothr@cumsl.msl.columbia.edu

PS: Almost forgot: I'm using Red Hat 7.3 with ext3 filesystems.
---------- Forwarded message ----------
Date: Sat, 26 Apr 2003 23:54:06 -0400
From: Theodore Ts'o <tytso@mit.edu>
To: rothr@cumsl.msl.columbia.edu
Cc: ext3-users@redhat.com
Subject: Re: Duplicating Hard Drive Problem

On Sat, Apr 26, 2003 at 05:15:08PM -0400, rothr@cumsl.msl.columbia.edu wrote:
> However, on the faulty nodes, the first line of 'df -hT' is instead:
>
> /dev/hda3 ext3 3.6G 1.9G 1.6G 53% /
>
> It's pretty clear that a typo was made when these two drives were
> partitioned, leaving about 80% of the hard drive unallocated, and the /
> partition a tenth of the size it should be. The company that sold us the
> cluster has been less supportive than I would have liked.
>
> My question, then, is: What is the simplest way to go about getting the
> partition up to its correct size?

1) Boot from a rescue CD-ROM that has fdisk and the resize2fs program on it.

2) Use "fdisk /dev/hda" and make sure the hda3 partition is correctly sized. Make sure nothing else from /dev/hda is mounted when you run fdisk, so that the kernel will re-read the partition table. Otherwise, you may need to reboot the system after you run fdisk.

3) Use the command "resize2fs /dev/hda3" to expand the filesystem on /dev/hda3 to the full size of the hda3 partition.

- Ted
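One possible explanation for the label error in the follow-up above, offered as an assumption and not a confirmed diagnosis: Red Hat installs of that era typically mount by label (fstab entries such as LABEL=/), and a filesystem recreated with plain mkfs carries no label unless one is given. The sketch below lists the labels an fstab expects. The helper name fstab_labels is hypothetical, and the e2label and tune2fs lines are left as comments because the right device name depends on where the drive is attached.

```shell
#!/bin/sh
# fstab_labels (hypothetical helper): print "label mountpoint" for each
# fstab entry that mounts by label. Reads /etc/fstab unless a file is given.
fstab_labels() {
    awk '$1 ~ /^LABEL=/ { sub(/^LABEL=/, "", $1); print $1, $2 }' "${1:-/etc/fstab}"
}

# Possible repair, run as root with the partition unmounted
# (the device name is an assumption):
#   e2label /dev/hda3 /      # set the label that fstab refers to
#   e2label /dev/hda3        # print the label to confirm it
#   tune2fs -j /dev/hda3     # add a journal if mkfs created plain ext2
```

If fstab_labels shows an entry for /, checking the partition with e2label would confirm or rule out the missing label before anything is changed.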