Greetings,

I am running a 3ware 9500 SATA raid card in a 12x300GB raid 50 configuration.

Here is dmesg identifying the controller:

3ware device driver for 9000 series storage controllers, version: 2.50.02.012
twa0: <3ware 9000 series Storage Controller> port 0xb800-0xb8ff mem 0xfb800000-0xfbffffff,0xfc5ffc00-0xfc5ffcff irq 24 at device 2.0 on pci2
twa0: 12 ports, Firmware FE9X 2.06.00.009, BIOS BE9X 2.03.01.051

I was getting occasional kernel panics in 5.4 doing high I/O type things (typically an rsync operation). I was told that twa was updated in 5-STABLE, so yesterday I upgraded. I've attempted an rsync twice since the upgrade, both caused a panic. Here is /var/log/messages from just before the reboot last night:

Oct 5 23:08:41 leopard kernel: ected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard last message repeated 7 times
Oct 5 23:08:41 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg =ected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard last message repeated 106 times
Oct 5 23:08:41 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard last message repeated 296 times
Oct 5 23:09:42 leopard kernel: twa0: ERROR: (0x05: 0x210b): Request timed out!:request = 0xc2425600
Oct 5 23:09:42 leopard kernel: twa0: INFO: (0x16: 0x1108): Resetting controller...:
Oct 5 23:09:42 leopard kernel: twa0: INFO: (0x04: 0x005e): Cache synchronized after power fail: unit=0
Oct 5 23:09:42 leopard kernel: twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
Oct 5 23:09:42 leopard kernel: twa0: INFO: (0x16: 0x1107): Controller reset done!:
Oct 5 23:12:59 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025d50; Missing bits: [MC_RDY,]
Oct 5 23:13:00 leopard last message repeated 379 times
Oct 5 23:13:00 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025d52; Missing bits: [MC_RDY,]
Oct 5 23:46:31 leopard syslogd: kernel boot file is /boot/kernel/kernel

Please let me know who I may contact or what I can do to get this debugged.

Thanks,
Dan
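
[Editor's sketch] A log excerpt like the one above is easier to read once the twa messages are tallied and the syslogd "last message repeated N times" lines are folded back into the message they follow. The Python below is only a rough illustration of that idea: the function name `tally` and the regular expressions are assumptions made for this example, and the meaning of the two hex fields in parentheses is not explained in this thread, so the sketch simply keys on the second one. Lines that were mangled in the log (for example those starting with "ected status bit(s)") are skipped rather than guessed at.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt in this thread:
#   "... kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): ..."
TWA_RE = re.compile(
    r"kernel: twa\d+: (ERROR|INFO): \((0x[0-9a-f]+): (0x[0-9a-f]+)\): ([^:]+)"
)
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def tally(lines):
    """Count twa messages per (severity, second hex field, text)."""
    counts = Counter()
    last_key = None  # key of the immediately preceding twa line, if any
    for line in lines:
        m = TWA_RE.search(line)
        if m:
            last_key = (m.group(1), m.group(3), m.group(4).strip())
            counts[last_key] += 1
            continue
        r = REPEAT_RE.search(line)
        if r and last_key is not None:
            # syslogd collapsed N further copies of the previous message
            counts[last_key] += int(r.group(1))
        else:
            # garbled or unrelated line: do not attribute later repeats to it
            last_key = None
    return counts


SAMPLE = """\
Oct 5 23:08:41 leopard kernel: ected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025f32; Missing bits: [MC_RDY,]
Oct 5 23:08:41 leopard last message repeated 7 times
Oct 5 23:09:42 leopard kernel: twa0: ERROR: (0x05: 0x210b): Request timed out!:request = 0xc2425600
Oct 5 23:09:42 leopard kernel: twa0: INFO: (0x16: 0x1108): Resetting controller...:
""".splitlines()

if __name__ == "__main__":
    for key, n in tally(SAMPLE).most_common():
        print(n, *key)
```

Run against a saved copy of /var/log/messages, this would show at a glance how many MC_RDY errors preceded each request timeout and controller reset.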
On Thursday 06 October 2005 04:07 pm, Dan Rue wrote:
> Greetings,
>
> I am running a 3ware 9500 SATA raid card in a 12x300GB raid 50
> configuration.
>
> Here is dmesg identifying the controller:
> 3ware device driver for 9000 series storage controllers, version:
> 2.50.02.012 twa0: <3ware 9000 series Storage Controller> port
> 0xb800-0xb8ff mem 0xfb800000-0xfbffffff,0xfc5ffc00-0xfc5ffcff irq
> 24 at device 2.0 on pci2 twa0: 12 ports, Firmware FE9X 2.06.00.009,
> BIOS BE9X 2.03.01.051
>
> I was getting occasional kernel panics in 5.4 doing high I/O type
> things (typically an rsync operation). I was told that twa was
> updated in 5-STABLE, so yesterday I upgraded. I've attempted an
> rsync twice since the upgrade, both caused a panic. Here is
> /var/log/messages from just before the reboot last night:

--- >8 --- SNIP!!! --- >8 ---

There's a newer vendor driver and firmware.

http://www.3ware.com/support/download.asp

1. Driver

   Select 9550SX series (not 9000 series) and download the 9.3.0 version for FreeBSD. It contains newer driver source and binary, which seems to work pretty well with 9000 series controllers as well.

2. Firmware

   Select 9000 series and download the 9.2.1.1 version, which also seems to improve stability.

This driver is directly supported by 3ware.

http://www.3ware.com/support/support.asp

Jung-uk Kim
> -----Original Message-----
> From: owner-freebsd-stable@freebsd.org
> [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Jung-uk Kim
> Sent: Thursday, October 06, 2005 1:30 PM
> To: freebsd-stable@FreeBSD.org
> Cc: Dan Rue
> Subject: Re: twa kernel panic under heavy IO
>
> On Thursday 06 October 2005 04:07 pm, Dan Rue wrote:
> > Greetings,
> >
> > I am running a 3ware 9500 SATA raid card in a 12x300GB raid 50
> > configuration.
> >
> > Here is dmesg identifying the controller:
> > 3ware device driver for 9000 series storage controllers, version:
> > 2.50.02.012 twa0: <3ware 9000 series Storage Controller> port
> > 0xb800-0xb8ff mem 0xfb800000-0xfbffffff,0xfc5ffc00-0xfc5ffcff irq
> > 24 at device 2.0 on pci2 twa0: 12 ports, Firmware FE9X 2.06.00.009,
> > BIOS BE9X 2.03.01.051
> >
> > I was getting occasional kernel panics in 5.4 doing high I/O type
> > things (typically an rsync operation). I was told that twa was
> > updated in 5-STABLE, so yesterday I upgraded. I've attempted an rsync
> > twice since the upgrade, both caused a panic. Here is
> > /var/log/messages from just before the reboot last night:

Going by the dmesg, you have a 9.1.5.2 driver and 9.2 firmware. The driver in 5-STABLE is from the 9.2 release. So, you might not have the driver upgrade done properly. Try using the driver and firmware from the same release. If you still see problems, please contact 3ware support.

> --- >8 --- SNIP!!! --- >8 ---
>
> There's newer vendor driver and firmware.
>
> http://www.3ware.com/support/download.asp
>
> 1. Driver
>
> Select 9550SX series (not 9000 series) and download 9.3.0
> version for FreeBSD. It contains newer driver source and
> binary, which seems to work pretty well with 9000 series
> controllers as well.
>
> 2. Firmware
>
> Select 9000 series and download 9.2.1.1 version, which also
> seems to improve stability.
>
> This driver is directly supported by 3ware.
>
> http://www.3ware.com/support/support.asp
>
> Jung-uk Kim
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to
> "freebsd-stable-unsubscribe@freebsd.org"
>

--------------------------------------------------------
CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and contains information that is confidential and proprietary to Applied Micro Circuits Corporation or its subsidiaries. It is to be used solely for the purpose of furthering the parties' business relationship. All unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
> -----Original Message-----
> From: Brandon Fosdick [mailto:bfoz@bfoz.net]
> Sent: Friday, October 07, 2005 6:06 PM
> To: Vinod Kashyap
> Cc: Jung-uk Kim; freebsd-stable@freebsd.org; Dan Rue
> Subject: Re: twa kernel panic under heavy IO
>
> Vinod Kashyap wrote:
> > Going by the dmesg, you have a 9.1.5.2 driver and 9.2 firmware. The
> > driver in 5-STABLE is from the 9.2 release. So, you might not have
> > the driver upgrade done properly. Try using the driver and firmware
> > from the same release. If you still see problems, please contact
> > 3ware support.
>
> How did you figure out the versions? I'm looking at his
> dmesg, and my own, and I'm not seeing the version info in a
> recognizable form. I must admit that I haven't been able to
> grok the 3ware version numbers at all, so maybe I'm dense.
> What am I missing?

9.1.5.2 and 9.2 are releases from 3ware, of packages of software and firmware for 9000 series controllers. The driver version he has (2.50.02.012) is from the 9.1.5.2 release, and is part of FreeBSD 5.4. The driver version corresponding to the 9.2 release that's on 5-STABLE is 3.50.00.017. You can view/download the different releases from 3ware on the 3ware website.
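
[Editor's sketch] The driver-version-to-release mapping given in this reply can be written down as a small lookup, which is handy when comparing dmesg output across machines. This is only an illustration: the table holds just the two pairs stated in this thread, the names `KNOWN_DRIVER_RELEASES` and `driver_release` are made up for the example, and any other version should be checked against 3ware's release notes rather than inferred.

```python
import re

# Only the two pairs stated in this thread; everything else is unknown here.
KNOWN_DRIVER_RELEASES = {
    "2.50.02.012": "9.1.5.2",  # shipped with FreeBSD 5.4, per the thread
    "3.50.00.017": "9.2",      # in 5-STABLE, per the thread
}

# Matches the driver banner as it appears in the dmesg quoted in this thread.
VERSION_RE = re.compile(
    r"3ware device driver for 9000 series storage controllers, version:\s*([\d.]+)"
)


def driver_release(dmesg_text):
    """Return (driver_version, release or None), or None if no banner is found."""
    m = VERSION_RE.search(dmesg_text)
    if not m:
        return None
    version = m.group(1).rstrip(".")
    return version, KNOWN_DRIVER_RELEASES.get(version)


if __name__ == "__main__":
    dmesg = (
        "3ware device driver for 9000 series storage controllers, "
        "version: 2.50.02.012\n"
        "twa0: 12 ports, Firmware FE9X 2.06.00.009, BIOS BE9X 2.03.01.051\n"
    )
    print(driver_release(dmesg))
```

A result whose second element is None means the banner was found but the version is not one of the two named in this thread.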
> -----Original Message-----
> From: Brandon Fosdick [mailto:bfoz@bfoz.net]
> Sent: Friday, October 07, 2005 6:26 PM
> To: Vinod Kashyap
> Cc: freebsd-stable@freebsd.org; Jung-uk Kim; Dan Rue
> Subject: Re: twa kernel panic under heavy IO
>
> Vinod Kashyap wrote:
> >>How did you figure out the versions? I'm looking at his dmesg, and my
> >>own, and I'm not seeing the version info in a recognizable form. I
> >>must admit that I haven't been able to grok the 3ware version numbers
> >>at all, so maybe I'm dense.
> >>What am I missing?
> >
> >
> > 9.1.5.2 and 9.2 are releases from 3ware, of packages of software and
> > firmware for 9000 series controllers. The driver version he has
> > (2.50.02.012) is from the 9.1.5.2 release, and is part of FreeBSD 5.4.
> > The driver version corresponding to the 9.2 release that's on
> > 5-STABLE is 3.50.00.017. You can view/download the different releases
> > from 3ware on the 3ware website.
>
> Thanks, that helped. I take it that you just have to know
> this somehow? Is there a list somewhere that shows the
> mapping between FreeBSD, driver, and firmware versions?
>

Under 'Release Notes to View', select '<release>_Release_Notes_Web', and you will get to a page which lists the version of each individual component that's part of the release, among other things.

> The 3ware page has 9.2.1.1, which is what I updated to when I
> installed my 9500, but since then I've cvsup'd to 5-stable.
> Do I now have mismatched driver-firmware versions? How do I
> check this in the future? (besides bothering you about it)
>

The 9.2* versions are compatible. There are small incompatibilities if you are mixing 9.1* and 9.2*.

> And since we're here...is there any reason to update to 9.3
> if I only have a 9500S?
>

Either way should be fine.

> Thanks
>
> -----Original Message-----
> From: Brandon Fosdick [mailto:bfoz@bfoz.net]
> Sent: Friday, October 07, 2005 6:50 PM
> To: Vinod Kashyap
> Cc: freebsd-stable@freebsd.org; Jung-uk Kim; Dan Rue
> Subject: Re: twa kernel panic under heavy IO
>
> Vinod Kashyap wrote:
> > Under 'Release Notes to View', select '<release>_Release_Notes_Web',
> > and you will get to a page which lists the version of each individual
> > component that's part of the release, among other things.
>
> :) I would have known that if Firefox wasn't barfing on PDF's
> right now, or if I had bothered to boot my laptop (OSX).
>
> The "Release Details" section has a note that reads "Linux
> and FreeBSD drivers are bundled with firmware 2.08.00.003".
> What does that mean? Do the drivers in 5-stable have a copy
> of the firmware embedded in them? Is the driver going to try
> and downgrade my firmware?
>

The drivers in the kernel tree are not bundled with the firmware image by default. You can bundle them by turning on a switch in the Makefile. A bundled driver will download the firmware if the running firmware is from a different release (9.1*, 9.2* etc.), or if the running firmware is an older version from the release corresponding to that of the driver.

> > The 9.2* versions are compatible. There are small incompatibilities
> > if you are mixing 9.1* and 9.2*.
> >
> >>And since we're here...is there any reason to update to 9.3 if I only
> >>have a 9500S?
>
> Good to know.
>
> Thanks for the quick reply. FWIW I went with 3ware because I
> saw somewhere that a 3ware person was supposedly on this
> list. Glad to see it wasn't a rumor.
>

You are encouraged to contact 3ware support with questions related to 3ware.
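
[Editor's sketch] The rule described in this message for when a bundled driver downloads its firmware can be expressed as a short predicate. This is a reading of the prose, not the driver's actual logic: it assumes releases are compared as dotted numbers such as "9.2" or "9.2.1.1", that "a different release (9.1*, 9.2* etc.)" means the first two components differ, and that "older" means a lower dotted number within the same family. The function names are hypothetical, and the real driver may well compare firmware version numbers instead.

```python
def parse(release):
    """Turn a dotted release string such as '9.2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in release.split("."))


def family(release):
    """First two components, e.g. (9, 2) for both '9.2' and '9.2.1.1' (assumed)."""
    return parse(release)[:2]


def bundled_driver_would_download(driver_release, running_fw_release):
    """Assumed reading of the rule stated in the thread.

    Download if the running firmware is from a different release family,
    or if it is older than the driver's release within the same family.
    """
    if family(driver_release) != family(running_fw_release):
        return True
    return parse(running_fw_release) < parse(driver_release)


if __name__ == "__main__":
    for drv, fw in [("9.2", "9.1.5.2"), ("9.2.1.1", "9.2"), ("9.2", "9.2.1.1")]:
        print(drv, fw, bundled_driver_would_download(drv, fw))
```

Under these assumptions a 9.2 driver would leave newer 9.2.1.1 firmware alone, but it would replace firmware from any 9.1* or 9.3* release, which in the latter case would be a downgrade.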
> -----Original Message-----
> From: Dan Rue [mailto:drue@therub.org]
> Sent: Monday, October 24, 2005 9:14 AM
> To: Vinod Kashyap
> Cc: freebsd-stable@FreeBSD.org
> Subject: Re: twa kernel panic under heavy IO
>
> On Thu, Oct 06, 2005 at 01:41:38PM -0700, Vinod Kashyap wrote:
> > > -----Original Message-----
> > > From: owner-freebsd-stable@freebsd.org
> > > [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Jung-uk Kim
> > > Sent: Thursday, October 06, 2005 1:30 PM
> > > To: freebsd-stable@FreeBSD.org
> > > Cc: Dan Rue
> > > Subject: Re: twa kernel panic under heavy IO
> > >
> > > On Thursday 06 October 2005 04:07 pm, Dan Rue wrote:
> > > > Greetings,
> > > >
> > > > I am running a 3ware 9500 SATA raid card in a 12x300GB raid 50
> > > > configuration.
> > > >
> > > > Here is dmesg identifying the controller:
> > > > 3ware device driver for 9000 series storage controllers, version:
> > > > 2.50.02.012 twa0: <3ware 9000 series Storage Controller> port
> > > > 0xb800-0xb8ff mem 0xfb800000-0xfbffffff,0xfc5ffc00-0xfc5ffcff irq
> > > > 24 at device 2.0 on pci2 twa0: 12 ports, Firmware FE9X
> > > > 2.06.00.009, BIOS BE9X 2.03.01.051
> > > >
> > > > I was getting occasional kernel panics in 5.4 doing high I/O type
> > > > things (typically an rsync operation). I was told that twa was
> > > > updated in 5-STABLE, so yesterday I upgraded. I've
> >
> > Going by the dmesg, you have a 9.1.5.2 driver and 9.2 firmware. The
> > driver in 5-STABLE is from the 9.2 release. So, you might not have
> > the driver upgrade done properly. Try using the driver and firmware
> > from the same release. If you still see problems, please contact
> > 3ware support.
>
> Sorry about that, the driver and firmware were not actually
> mismatched - I had pasted my dmesg from a previous email when
> I was running a different version of FreeBSD.
>
> ---
>
> After going around with 3ware web support, this issue has
> been concluded, but not resolved. I tried my 3ware 9500 on
> FreeBSD 5.3, 5.4, and 5-STABLE. With all of these versions
> of OS and driver (I never changed the driver version
> manually), I received hard lock ups and reboots (though,
> interestingly, no kernel panics).
>
> 3ware had me check and troubleshoot a number of
> possibilities, until they finally decided it was a hardware
> problem and issued me a replacement card. However, in the
> meantime, I upgraded to FreeBSD 6.0RC1 and the machine is
> now working flawlessly. I returned the replacement card unused.
>
> I can only conclude that this means that there is a large
> (timing?) bug in the twa driver in FreeBSD 5.3/5.4/5-STABLE
> (as opposed to an isolated hardware problem with my setup).
>
> I have pasted the full conversation with 3ware on my website
> for those interested here:
> http://therub.org/9500.txt (sorry for the poor formatting)
>
> At one point, I received the following error message just
> before the machine locked up:
>
> >Oct 12 11:36:13 leopard kernel: initiate_write_filepage: already
> >started
>
> I grepped for that error message in the FreeBSD kernel
> source, and found it in sys/ufs/ffs/ffs_softdep.c on line
> 3580. What makes it really interesting is the comment above
> where the error is thrown:
>
>     if (pagedep->pd_state & IOSTARTED) {
>             /*
>              * This can only happen if there is a driver that does not
>              * understand chaining. Here biodone will reissue the call
>              * to strategy for the incomplete buffers.
>              */
>             printf("initiate_write_filepage: already started\n");
>             return;
>     }
>
> I know this is a 3ware issue. I am posting this resolution
> response here in hopes that it may help someone else that
> hits this bug - and with the hope that publicly it will get
> the attention of the 3ware FreeBSD driver team/individual.
>

The error messages you are seeing are consistent with bad hardware. The hardware is becoming unavailable for the driver to talk to it. This other message "initiate_write_filepage..." is different, but did you see the machine hang after this message got printed? I don't think it's related to the hang.

> Dan
>
> -----Original Message-----
> From: Dan Rue [mailto:drue@therub.org]
> Sent: Monday, October 24, 2005 11:23 AM
> To: Vinod Kashyap
> Cc: freebsd-stable@FreeBSD.org
> Subject: Re: twa kernel panic under heavy IO
>
> On Mon, Oct 24, 2005 at 11:07:28AM -0700, Vinod Kashyap wrote:
> > > After going around with 3ware web support, this issue has been
> > > concluded, but not resolved. I tried my 3ware 9500 on FreeBSD 5.3,
> > > 5.4, and 5-STABLE. With all of these versions of OS and driver (I
> > > never changed the driver version manually), I received hard lock ups
> > > and reboots (though, interestingly, no kernel panics).
> > >
> > > 3ware had me check and troubleshoot a number of possibilities, until
> > > they finally decided it was a hardware problem and issued me a
> > > replacement card. However, in the meantime, I upgraded to FreeBSD
> > > 6.0RC1 and the machine is now working flawlessly. I returned the
> > > replacement card unused.
> > >
> > > I can only conclude that this means that there is a large
> > > (timing?) bug in the twa driver in FreeBSD 5.3/5.4/5-STABLE (as
> > > opposed to an isolated hardware problem with my setup).
> > >
> > > I have pasted the full conversation with 3ware on my website for
> > > those interested here:
> > > http://therub.org/9500.txt (sorry for the poor formatting)
> > >
> > > At one point, I received the following error message just before the
> > > machine locked up:
> > >
> > > >Oct 12 11:36:13 leopard kernel: initiate_write_filepage: already
> > > >started
> > >
> > > I grepped for that error message in the FreeBSD kernel source, and
> > > found it in sys/ufs/ffs/ffs_softdep.c on line 3580. What makes it
> > > really interesting is the comment above where the error is thrown:
> > >
> > >     if (pagedep->pd_state & IOSTARTED) {
> > >             /*
> > >              * This can only happen if there is a driver that does not
> > >              * understand chaining. Here biodone will reissue the call
> > >              * to strategy for the incomplete buffers.
> > >              */
> > >             printf("initiate_write_filepage: already started\n");
> > >             return;
> > >     }
> > >
> > > I know this is a 3ware issue. I am posting this resolution response
> > > here in hopes that it may help someone else that hits this bug - and
> > > with the hope that publicly it will get the attention of the 3ware
> > > FreeBSD driver team/individual.
> > >
> >
> > The error messages you are seeing are consistent with bad hardware.
> > The hardware is becoming unavailable for the driver to talk to it.
> > This other message "initiate_write_filepage..." is different but did
> > you see the machine hang after this message got printed? I don't
> > think it's related to the hang.
> >
>
> The initiate_write_filepage occurred right before the hang.
> Here's the full log from that time:
>
> Oct 6 17:00:32 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025bb0; Missing bits: [MC_RDY,]
> Oct 6 17:00:33 leopard last message repeated 399 times
> Oct 6 17:00:36 leopard kernel: ected status bit(s): status reg = 0x15025bb2; Missing bits: [MC_RDY,]
> Oct 6 17:00:36 leopard kernel: twa0: ERROR: (0x16: 0x1301): Missing expected status bit(s): status reg = 0x15025bb2; Missing bits: [MC_RDY,]
> Oct 6 17:00:36 leopard last message repeated 296 times
> Oct 6 17:01:37 leopard kernel: initiate_write_filepage: already started
> Oct 6 17:01:37 leopard last message repeated 83 times
> Oct 6 17:01:37 leopard kernel: twa0: ERROR: (0x05: 0x210b): Request timed out!: request = 0xc23fb0a0
> Oct 6 17:01:37 leopard kernel: twa0: INFO: (0x16: 0x1108): Resetting controller...:
> Oct 6 17:01:37 leopard kernel: twa0: INFO: (0x04: 0x005e): Cache synchronized after power fail: unit=0
> Oct 6 17:01:37 leopard kernel: twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
> Oct 6 17:01:37 leopard kernel: twa0: INFO: (0x16: 0x1107): Controller reset done!:
>

Ok, that message is preceded by those same messages that indicate that the hardware became unavailable. So, that message seems to have been the result of the same hardware issue I mentioned.

>
> If it's a hardware problem, why would it run fine on 6.0?
> The hang was very easy to trigger, and I've put the 6.0
> machine through the gauntlet trying to recreate the problem.
>

That's a valid question. It could be only a matter of time...

> Thanks for looking into this (again) for me,
> Dan
>