Hi,

the RAID info keeps disappearing, and I'm not convinced it's hardware. I've tried it on 2 different (but similar) boxes; after some time, the kernel reports something like:

Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ad6: hard error writing fsbn 0 (ad6 bn 0; cn 0 tn 0 sn 0) trying PIO mode
Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ad4: hard error writing fsbn 0 (ad4 bn 0; cn 0 tn 0 sn 0) trying PIO mode
Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ad6: hard error writing fsbn 0 (ad6 bn 0; cn 0 tn 0 sn 0) status=51 error=10
Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ar0: ERROR - array broken
Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ad4: hard error writing fsbn 0 (ad4 bn 0; cn 0 tn 0 sn 0) status=51 error=10
Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ar0: ERROR - array broken
Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ar0: ERROR - array broken

from dmesg:

...
FreeBSD 4.10-STABLE #7: Fri Jul 2 09:57:10 IDT 2004
...
ar0: 381564MB <ATA RAID0 array> [48642/255/63] status: READY subdisks:
 0 READY ad4: 190782MB <ST3200822A> [387621/16/63] at ata2-master UDMA100
 1 READY ad6: 190782MB <ST3200822A> [387621/16/63] at ata3-master UDMA100

I've partitioned the disk like so:

#         size    offset    fstype   [fsize bsize bps/cpg]
  a:   1024000         0    4.2BSD     2048 16384    90   # (Cyl.    0 - 63*)
  b:   8388608   1024000      swap                        # (Cyl.   63*- 585*)
  c: 781433667         0    unused        0     0         # (Cyl.    0 - 48641*)
  d:   1024000   9412608    4.2BSD        0     0     0   # (Cyl.  585*- 649*)
  h: 770997059  10436608    4.2BSD        0     0     0   # (Cyl.  649*- 48641*)

The machine boots diskless, so just to check the disk I did a newfs -U to /dev/ar0s1a, then restored a root image onto it, no problems.

The h partition has a big postgres database; starting postgres I get the above error. Notice that the error is a bit suspicious: fsbn 0 ( ... bn 0; cn 0; tn 0; sn 0)

Using the Fastrack/Promise BIOS I reconfigured the RAID and tried the above again, with the same results.

btw, on a different host (same motherboard, same type of disks) with an older kernel, it panics, but the disk error is the same, and the array info is lost.

Any more info/help needed to track this down?

thanks,
	danny

On Fri, 2 Jul 2004, Danny Braniss wrote:

> Hi,
> the raid info keeps disappearing!, and im not convinced it's
> hardware. I've tried it on 2 different - but similar - boxes, after some
> time, kernel reports something like:
> Jul 2 10:18:35 cs7.cs.huji.ac.il /kernel: ad6: hard error writing fsbn 0 (ad6
> bn 0; cn 0 tn 0 sn 0) trying PIO mode

How did you construct the array volume? This looks like one of the offsets in the disklabel is wrong.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org

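The disklabel hypothesis can be checked with plain arithmetic on the numbers already posted. The sketch below is only an illustration: the function name check_label is made up for this example, the sizes and offsets are copied from the disklabel in the original message, and it assumes that 'c' describes the whole slice and that the other partitions are meant to be laid out back to back.

```python
# Partition entries copied from the posted disklabel:
# letter -> (size, offset), both in sectors.
label = {
    "a": (1024000, 0),
    "b": (8388608, 1024000),
    "c": (781433667, 0),        # assumed to span the whole slice
    "d": (1024000, 9412608),
    "h": (770997059, 10436608),
}


def check_label(label, whole="c"):
    """Return a list of problems: gaps, overlaps, or a partition running past 'c'."""
    problems = []
    slice_size, slice_start = label[whole]
    # Sort the real partitions by their starting offset.
    parts = sorted(
        (offset, size, name)
        for name, (size, offset) in label.items()
        if name != whole
    )
    expected = slice_start
    for offset, size, name in parts:
        if offset < expected:
            problems.append(f"{name}: overlaps previous partition by {expected - offset} sectors")
        elif offset > expected:
            problems.append(f"{name}: gap of {offset - expected} sectors before it")
        expected = offset + size
    if expected > slice_start + slice_size:
        problems.append(f"last partition ends {expected - slice_start - slice_size} sectors past '{whole}'")
    return problems


problems = check_label(label)
print(problems or "offsets are contiguous and fit inside c")

# Geometry reported by dmesg for ar0: [48642/255/63]
geometry_sectors = 48642 * 255 * 63
print("geometry minus c size:", geometry_sectors - label["c"][0])
```

With the posted numbers the check reports no gaps or overlaps, and h ends exactly where c ends. The geometry total exceeds the size of c by 63 sectors, which would be consistent with the slice starting one track into the volume, though that starting offset is not shown anywhere in the thread.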
On Sat, 3 Jul 2004, Danny Braniss wrote:

> > > How did you construct the array volume? This looks like one of the
> > > offsets in the disklabel is wrong.
> >
> > I used sysinstall, and disklabel -e to change the partition letters.
> > The problem appears after several hours of disk usage, and the only
> > partition in use is h.
>
> whops, rereading the question, here is the correct answer, sorry.
>
> I used the bios to define the raid0, stripe, 2 disks.
> (the menu is 'fool-proof', so i guess i couldn't have made a mistook :-)

OK, so you used the ATA controller's menu to construct the RAID array, then used sysinstall to slice the ensuing volume. Hm. It appears that writes to sector 0 are disallowed. Can you try installing without the array defined, to make sure the disks are writable otherwise?

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org

> ----- Original Message -----
> From: "Doug White" <dwhite@gumbysoft.com>
> To: "Danny Braniss" <danny@cs.huji.ac.il>
> Cc: <freebsd-stable@freebsd.org>; "Soren Schmidt" <sos@FreeBSD.org>
> Sent: Sunday, July 04, 2004 4:45 AM
> Subject: Re: problems with RAID0 and Intel/SE7501WV2/Promise
>
> > OK, so you used the ATA controller's menu to construct the RAID array,
> > then used sysinstall to slice the ensuing volume. Hm. It appears that
> > writes to sector 0 are disallowed. Can you try installing without the
> > array defined, to make sure the disks are writable otherwise?
>
> Can this be some kind of virus protection (against boot sector viruses)
> setting in the system bios ? Some systems have this, and activate it when
> doing a 'load setup defaults'. Don't know if this applies to RAID arrays
> though.

I doubt it, since the program/kernel had IMHO no reason to write there in the first place.

	danny

> OK, so you used the ATA controller's menu to construct the RAID array,
> then used sysinstall to slice the ensuing volume. Hm. It appears that
> writes to sector 0 are disallowed. Can you try installing without the
> array defined, to make sure the disks are writable otherwise?

The problem appears after several hours of reading/writing (actually building a huge postgres database). The partition is h, not even close to the beginning of the disk, and IMHO the kernel/postgres have no reason to write on fsbn 0 (ad4 bn 0; cn 0 tn 0 sn 0) in any case.

btw, a similar system is working ok, with the small difference in the size of the disks: 120GBx2 vs. 200GBx2 RAID0.
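
The "status=51 error=10" part of the log lines has not been looked at in the thread so far, and it can be decoded against the ATA status and error register layout. The sketch below is an aid for reading the log, not a diagnosis. It assumes the kernel prints both values in hexadecimal, and the bit names are taken from older ATA revisions (some of these bits were later renamed or marked obsolete). The helper name decode is made up for this example.

```python
import re

# ATA status register bits (names from older ATA revisions).
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}

# ATA error register bits (names from older ATA revisions).
ERROR_BITS = {
    0x80: "ICRC/BBK", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "TK0NF", 0x01: "AMNF",
}


def decode(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]


# One of the log lines from the original post, without the syslog prefix.
line = "ad6: hard error writing fsbn 0 (ad6 bn 0; cn 0 tn 0 sn 0) status=51 error=10"

match = re.search(r"status=([0-9a-fA-F]+) error=([0-9a-fA-F]+)", line)
# Assumption: both fields are hexadecimal.
status, error = (int(field, 16) for field in match.groups())

print("status:", decode(status, STATUS_BITS))
print("error: ", decode(error, ERROR_BITS))
```

Under those assumptions, status 0x51 is DRDY, DSC and ERR, and error 0x10 is IDNF ("ID not found"), which a drive reports when it cannot find the sector address it was asked for.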