Hi there,

I'm using RH7.2 (with the 2.4.9-30 kernel and its required components) as a base for a server system running kernel 2.4.18. I've gone to this version to get around non-performing aic7xxx drivers in the stock 7.2 kernels, and to pick up updated gigabit ethernet drivers.

I have a RAID unit (Medea) attached to an Adaptec 3916, coming up as sdb. It has 2kb blocks, but the fault I'm about to describe is evident with other block sizes and controllers as well.

I have a little script that makes a bunch of large files to give the filesystem a beating. It goes like this:

#!/bin/bash
i=1000
for ((i=0; i != 301; i++)); do
    time -p dd if=/dev/zero of=./bigfile.$i bs=1024k count=1024
    echo $i
done

About 4 files in, it dies with:

dd: writing `./bigfile.4': Read-only file system

In the messages file, I see:

Mar 27 17:28:15 r5 kernel: journal_bmap: journal block not found at offset 1607 on sd(8,17)
Mar 27 17:28:15 r5 kernel: Aborting journal on device sd(8,17).
Mar 27 17:28:15 r5 kernel: ext3_abort called.
Mar 27 17:28:15 r5 kernel: EXT3-fs abort (device sd(8,17)): ext3_journal_start: Detected aborted journal
Mar 27 17:28:15 r5 kernel: Remounting filesystem read-only
Mar 27 17:28:15 r5 kernel: EXT3-fs error (device sd(8,17)) in start_transaction: Journal has aborted

If I reformat the drive as ext2, it behaves just fine through the whole test. I'm no longer logging any scsi errors (RH7.2 with 2.4.9-30 logged plenty).

Are there any known issues of this type about? I know nothing about the internals of the filesystem and am just trying to get up and running. Are there any tests I can run to help point fingers at potential problems?

Running the same test on my system hard disk (also on an Adaptec controller) runs out of space before it crashes, but I expect that disk is solid.

I'd appreciate the help :)

Thanks,

Tony
----
Tony Clark                     e: tony@rsp.com.au
Rising Sun Pictures            w: http://www.rsp.com.au
Adelaide / Sydney, Australia   t: +61 8 8364 6074
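On the question of tests that can point fingers: one simple check along these lines is to write a known pattern through the filesystem, flush it, and read it back, which distinguishes silent data corruption from journal-layout trouble. This is only a sketch; the file name and size are arbitrary placeholders, and a real run should use sizes closer to the failing workload.

```shell
#!/bin/bash
# Sketch: write zeros through the filesystem, flush, and verify the
# data reads back intact. File name and size here are arbitrary.
f=./verify.bin
size_mb=4

dd if=/dev/zero of="$f" bs=1M count="$size_mb" 2>/dev/null
sync   # push the data out of the page cache toward the device

# /dev/zero is an endless stream of zero bytes, so comparing the first
# size_mb megabytes of the file against it checks every byte written.
if cmp -n $((size_mb * 1024 * 1024)) "$f" /dev/zero; then
    echo "read-back OK"
else
    echo "read-back MISMATCH"
fi
rm -f "$f"
```

Repeating this with files large enough to reach the point where the journal aborts (and on more than one controller) would help separate a filesystem bug from flaky RAID hardware.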
Hi again,

I suspect I may need to retract that last one, as I've just forced a fatal error on ext2 as well... I'd appreciate any thoughts people have in terms of locating the fault, although I suspect it's in the RAID.

Thanks,

Tony

On Wednesday, March 27, 2002, at 04:42 PM, Tony Clark wrote:

> I'm using RH7.2 (with the 2.4.9-30 kernel and its required components)
> as a base for a server system running kernel 2.4.18. [...]
>
> If I reformat the drive as ext2, it behaves quite fine through the
> whole test. I'm no longer logging any scsi errors (RH7.2 with 2.4.9-30
> did plenty).

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
Tony Clark wrote:

> I have a little script that makes a bunch of large files to give the
> filesystem a beating. It goes like this:
>
> #!/bin/bash
> i=1000
> for ((i=0; i != 301; i++)); do
>     time -p dd if=/dev/zero of=./bigfile.$i bs=1024k count=1024
>     echo $i
> done
>
> About 4 files in, it dies with 'dd: writing `./bigfile.4': Read-only
> file system'

This may be a dumb question, but you do have more than 4GB of space available on the RAID array when you start, correct? With dd's block size at 1024k (1MB) and a count of 1024, each file is 1024MB, or 1GB. The loop writes 301 of those 1GB files, so you'd need about 301GB to run the whole test without running out of room. Although the "journal block not found" message *does* point to general data corruption, you never know, this might have something to do with it too.
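That space arithmetic can be checked directly with bash arithmetic; this is a sketch whose numbers simply mirror the dd parameters in the quoted script.

```shell
#!/bin/bash
# Each dd invocation writes bs * count bytes:
# 1024k block size * 1024 blocks = 1 GiB per file.
bytes_per_file=$((1024 * 1024 * 1024))

# The loop runs i = 0..300, i.e. 301 files in total.
files=301

total_gib=$(( files * bytes_per_file / (1024 * 1024 * 1024) ))
echo "space needed: ${total_gib} GiB"   # -> space needed: 301 GiB
```

Comparing that figure against `df -h` on the mount point before the run would rule out a simple out-of-space condition.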