Hiya, don't know where else to report this. Please
correct me if this isn't the right place.

I just ran into a serious bug :((

We were trying to create a virtual filesystem
in an image (file) of around 238 GB.

Let the file's name be foo.img, then we did:

losetup /dev/loop0 foo.img

and then used fdisk /dev/loop0 to create this partition
table:

uxley:~>fdisk -lu /dev/loop0

Disk /dev/loop0: 238.3 GB, 238370684928 bytes
255 heads, 63 sectors/track, 28980 cylinders, total 465567744 sectors
Units = sectors of 1 * 512 = 512 bytes

      Device Boot      Start         End      Blocks   Id  System
/dev/loop0p1   *          63      401624      200781   83  Linux
/dev/loop0p2          401625    16048934     7823655   83  Linux
/dev/loop0p3        16048935    21928724     2939895   82  Linux swap / Solaris
/dev/loop0p4        21928725   465563699   221817487+   5  Extended
/dev/loop0p5        21928788    27808514     2939863+  83  Linux
/dev/loop0p6        27808578    47359619     9775521   83  Linux
/dev/loop0p7        47359683    57143204     4891761   83  Linux
/dev/loop0p8        57143268   465563699   204210216   83  Linux

Next we did:

losetup -o $((512 * 63)) /dev/loop1 /dev/loop0

which should make the first partition available under /dev/loop1
(this certainly works if that partition already contains a fs,
we then can mount it).

Finally, I wanted to create a filesystem and ran the following
command:

uxley:~>mke2fs -j -L "/boot" /dev/loop1
mke2fs 1.40-WIP (14-Nov-2006)
Filesystem label=/boot
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
29097984 inodes, 58195960 blocks
2909798 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
1776 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: 306/1776

Here the machine completely halted/crashed. I don't know what
happened, because it's a remote machine.

The writing of the inode table started very fast, but it was
already slowing down the last few - and completely stopped
at 306, which was 12 minutes ago (my ssh connection to the
machine still didn't time out, weird enough).

I can still ping the machine, I see.

Note that mke2fs says: 29097984 inodes, 58195960 blocks
That is 58195960 * 4096 = 238370652160, the full size of
the image file?!?

This partition is only 200 MB though!

Did I do something very stupid, or is this a bug in mke2fs?

--
Carlo Wood <carlo at alinoe.com>
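
The arithmetic behind that observation can be checked with a few lines of shell. This is only a sketch: the variable names are made up for illustration, and the figures are copied from the fdisk and mke2fs output in the message. It suggests that the block count mke2fs reported matches everything from the loop offset to the end of the image, not the size of the first partition.

```shell
#!/bin/sh
# Figures copied from the fdisk -lu and mke2fs output in the message.
# All variable names here are illustrative only.
sector_size=512
image_bytes=238370684928   # size of /dev/loop0 as reported by fdisk
p1_start=63                # first sector of loop0p1
p1_end=401624              # last sector of loop0p1
fs_block=4096              # block size chosen by mke2fs

# Byte offset passed to "losetup -o".
offset=$((p1_start * sector_size))

# Actual size of the first partition.
p1_bytes=$(( (p1_end - p1_start + 1) * sector_size ))
p1_blocks=$((p1_bytes / fs_block))

# Whole 4 KiB blocks between the offset and the end of the image.
visible_blocks=$(( (image_bytes - offset) / fs_block ))

echo "offset into loop0:          $offset bytes"
echo "partition 1:                $p1_bytes bytes ($p1_blocks blocks of $fs_block)"
echo "blocks from offset to end:  $visible_blocks"
```

The last figure comes out equal to the "58195960 blocks" printed by mke2fs, while the partition itself holds only about 50 thousand blocks of that size.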
I don't know how this can hang your system, but instead of doing this:

losetup -o $((512 * 63)) /dev/loop1 /dev/loop0

you could use kpartx:

kpartx -a /dev/loop0

You will then find your loop0p1 under /dev/mapper. Here is an example:

[root@shuVak ~]# dd if=/dev/zero of=caca bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.656977 seconds, 160 MB/s
[root@shuVak ~]# losetup /dev/loop0 caca
[root@shuVak ~]# fdisk /dev/loop0
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): p

Disk /dev/loop0: 104 MB, 104857600 bytes
255 heads, 63 sectors/track, 12 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

      Device Boot      Start         End      Blocks   Id  System

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-12, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-12, default 12):
Using default value 12

Command (m for help): p

Disk /dev/loop0: 104 MB, 104857600 bytes
255 heads, 63 sectors/track, 12 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

      Device Boot      Start         End      Blocks   Id  System
/dev/loop0p1               1          12       96358+  83  Linux

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 8e
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 22: Invalid argument.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@shuVak ~]# ls /dev/loop*
loop0  loop1  loop2  loop3  loop4  loop5  loop6  loop7
[root@shuVak ~]# kpartx -a /dev/loop0
[root@shuVak ~]# ls /dev/mapper/loop0p1
/dev/mapper/loop0p1

regards,
Jordi

On Fri, Oct 24, 2008 at 03:16:30AM +0200, Carlo Wood wrote:
> Hiya, don't know where else to report this. Please
> correct me if this isn't the right place.
>
> I just ran into a serious bug :((
>
> We were trying to create a virtual filesystem
> in an image (file) of around 238 GB.

[Using double losetup configuration]

> Here the machine completely halted/crashed. I don't know what
> happened, because it's a remote machine.
>
> The writing of the inode table started very fast, but it was
> already slowing down the last few - and completely stopped
> at 306, which was 12 minutes ago (my ssh connection to the
> machine still didn't time out, weird enough).

That's a classic case of mke2fs tickling a VM bug. The VM should be
able to do proper write throttling, but mke2fs writes blocks very
quickly, and so it's a great test of the kernel virtual memory
subsystem. :-)

So the fact that your system hung is a kernel bug, probably caused by
the double /dev/loop configuration. What version of the kernel are
you using?

There is a workaround that might help: "export MKE2FS_SYNC=10". This
will force an explicit sync system call every 10 blockgroups, which
tends to work around the kernel VM bug. It's not the default mainly
because mke2fs is such a great kernel test tool, and the VM really
needs to be able to handle this case.

> Note that mke2fs says: 29097984 inodes, 58195960 blocks
> That is 58195960 * 4096 = 238370652160 the full size of
> the image file?!?
>
> This partition is only 200MB though!

That's because you created /dev/loop1 as a loop device with an offset
of 512*63 bytes from the beginning of /dev/loop0. There is no way to
set the maximum size of a loop device (it's not something which is
currently defined as part of the interface of the LOOP_SET_STATUS
ioctl). If you want to do things manually like this, you'll need to
explicitly specify the size of the desired filesystem to mke2fs; it's
a shortcoming in the loop device.

The other way to do things would be to create an image file of the
desired partition length, and then assemble it by hand afterwards;
sorry, the loop device wasn't designed to be used to emulate a
partitioned disk. It could be, but kernel patches would be required
to extend its functionality.

Regards,

                                                - Ted
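
The two suggestions in this reply (MKE2FS_SYNC and an explicit filesystem size) could be combined roughly as follows. This is a dry-run sketch that only prints the command, since running it needs root and a configured /dev/loop1. The "-b 1024" option is an assumption added here so that the trailing block count is unambiguously in 1 KiB units; it is not part of the advice above. The variable names are illustrative.

```shell
#!/bin/sh
# Dry run: builds and prints the command instead of executing it.
# Sector numbers are taken from the fdisk -lu output earlier in the thread.
sector_size=512
p1_start=63
p1_end=401624

# Size of the first partition in 1 KiB blocks (matches fdisk's "Blocks" column).
p1_kib=$(( (p1_end - p1_start + 1) * sector_size / 1024 ))

# MKE2FS_SYNC=10 is the workaround named in the reply above.
# "-b 1024" is an assumption made for this sketch, so that the final
# argument is counted in 1 KiB blocks.
cmd="MKE2FS_SYNC=10 mke2fs -j -b 1024 -L /boot /dev/loop1 $p1_kib"

echo "$cmd"
```

With the size given on the command line, mke2fs has no need to ask the loop device how large it is, which is where the 238 GB figure came from.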
Carlo Wood wrote:
> Hiya, don't know where else to report this. Please
> correct me if this isn't the right place.
>
> I just ran into a serious bug :((

...

> Finally, I wanted to create a filesystem and ran the following
> command:
>
> uxley:~>mke2fs -j -L "/boot" /dev/loop1

...

> Here the machine completely halted/crashed. I don't know what
> happened, because it's a remote machine.

It'd be very good to have a console so you can see what really truly
happened. A remote machine w/o a console would scare me in any case. :)

Is the image file sparse, or is it filled in with zeros? Is it hosted
on ext3? Especially if it's sparse, but in either case, I'd be curious
to know if it works out any better or worse with other filesystems
hosting the image file - trying ext4 and/or xfs just as an experiment
might be interesting...

-Eric
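
One way to answer the sparse-or-not question is to compare the file's apparent size with the space actually allocated to it. The sketch below assumes GNU coreutils "stat" (the -c format option); the function names is_sparse and report are hypothetical helpers, not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# is_sparse APPARENT_BYTES ALLOCATED_BYTES
# Succeeds when fewer bytes are allocated on disk than the file's length.
is_sparse() {
    [ "$2" -lt "$1" ]
}

# report FILE
# Assumes GNU stat: %s = apparent size in bytes, %b = allocated blocks,
# %B = size in bytes of each block counted by %b.
report() {
    apparent=$(stat -c %s "$1")
    allocated=$(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
    if is_sparse "$apparent" "$allocated"; then
        echo "$1: sparse ($allocated of $apparent bytes allocated)"
    else
        echo "$1: fully allocated"
    fi
}

# Usage: sh this-script.sh foo.img
if [ $# -gt 0 ]; then
    report "$1"
fi
```

A file that is only partly sparse is also reported as sparse by this check, and a filesystem with transparent compression could make a fully written file look sparse, so the result is a hint more than a proof.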