Hi there,

I'm using RH7.2 (with the 2.4.9-30 kernel and its required components) as a base for a server system running kernel 2.4.18. I've gone to this version to get around non-performing aic7xxx drivers in the stock 7.2 kernels, and to pick up updated gigabit ethernet drivers.

I have a RAID unit (Medea) attached to an Adaptec 3916, coming up as sdb. It has 2kb blocks, but the fault I'm about to describe is evident with other block sizes and controllers as well.

I have a little script that makes a bunch of large files to give the filesystem a beating. It goes like this:

#!/bin/bash
i=1000
for ((i=0; i != 301; i++)); do
    time -p dd if=/dev/zero of=./bigfile.$i bs=1024k count=1024
    echo $i
done

About 4 files in, it dies with:

dd: writing `./bigfile.4': Read-only file system

In the messages file, I see:

Mar 27 17:28:15 r5 kernel: journal_bmap: journal block not found at offset 1607 on sd(8,17)
Mar 27 17:28:15 r5 kernel: Aborting journal on device sd(8,17).
Mar 27 17:28:15 r5 kernel: ext3_abort called.
Mar 27 17:28:15 r5 kernel: EXT3-fs abort (device sd(8,17)): ext3_journal_start: Detected aborted journal
Mar 27 17:28:15 r5 kernel: Remounting filesystem read-only
Mar 27 17:28:15 r5 kernel: EXT3-fs error (device sd(8,17)) in start_transaction: Journal has aborted

If I reformat the drive as ext2, it behaves just fine through the whole test. I'm no longer logging any scsi errors (RH7.2 with 2.4.9-30 logged plenty).

Are there any known issues of this type about? I know nothing about the internals of the filesystem and am just trying to get up and running. Are there any tests I can run to help point fingers at potential problems?

Running the same test on my system hard disk (also on an Adaptec controller) runs out of space before it crashes, but I expect that disk is solid.

I'd appreciate the help :)

Thanks,

Tony
----
Tony Clark                     e: tony@rsp.com.au
Rising Sun Pictures            w: http://www.rsp.com.au
Adelaide / Sydney, Australia   t: +61 8 8364 6074
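On the question of tests that can point fingers: one simple check along these lines is to write a known pattern through the filesystem, flush it, and read it back, which distinguishes silent data corruption from journal-layout trouble. This is only a sketch; the file name and size are arbitrary placeholders, and a real run should use sizes closer to the failing workload.

```shell
#!/bin/bash
# Sketch: write zeros through the filesystem, flush, and verify the
# data reads back intact. File name and size here are arbitrary.
f=./verify.bin
size_mb=4

dd if=/dev/zero of="$f" bs=1M count="$size_mb" 2>/dev/null
sync   # push the data out of the page cache toward the device

# /dev/zero is an endless stream of zero bytes, so comparing the first
# size_mb megabytes of the file against it checks every byte written.
if cmp -n $((size_mb * 1024 * 1024)) "$f" /dev/zero; then
    echo "read-back OK"
else
    echo "read-back MISMATCH"
fi
rm -f "$f"
```

Repeating this with files large enough to reach the point where the journal aborts (and on more than one controller) would help separate a filesystem bug from flaky RAID hardware.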
Hi again,

I suspect I may need to retract that last one, as I've just forced a fatal error on ext2 as well... I'd appreciate any thoughts people have in terms of locating the fault, although I suspect it's in the RAID.

Thanks,

Tony

On Wednesday, March 27, 2002, at 04:42 PM, Tony Clark wrote:

> I'm using RH7.2 (with the 2.4.9-30 kernel and its required components)
> as a base for a server system running kernel 2.4.18. [...]
>
> If I reformat the drive as ext2, it behaves quite fine through the
> whole test. I'm no longer logging any scsi errors (RH7.2 with 2.4.9-30
> did plenty).

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
Tony Clark wrote:

> I have a little script that makes a bunch of large files to give the
> filesystem a beating. It goes like this:
>
> #!/bin/bash
> i=1000
> for ((i=0; i != 301; i++)); do
>     time -p dd if=/dev/zero of=./bigfile.$i bs=1024k count=1024
>     echo $i
> done
>
> About 4 files in, it dies with 'dd: writing `./bigfile.4': Read-only
> file system'

This may be a dumb question, but you do have more than 4GB of space available on the RAID array when you start, correct? With dd's block size at 1024k (1MB) and a count of 1024, each file is 1024MB, or 1GB. The loop writes 301 of those 1GB files, so you'd need about 301GB to run the whole test without running out of room. Although the "journal block not found" message *does* point to general data corruption, you never know, this might have something to do with it too.
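That space arithmetic can be checked directly with bash arithmetic; this is a sketch whose numbers simply mirror the dd parameters in the quoted script.

```shell
#!/bin/bash
# Each dd invocation writes bs * count bytes:
# 1024k block size * 1024 blocks = 1 GiB per file.
bytes_per_file=$((1024 * 1024 * 1024))

# The loop runs i = 0..300, i.e. 301 files in total.
files=301

total_gib=$(( files * bytes_per_file / (1024 * 1024 * 1024) ))
echo "space needed: ${total_gib} GiB"   # -> space needed: 301 GiB
```

Comparing that figure against `df -h` on the mount point before the run would rule out a simple out-of-space condition.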