Panagiotis Christias
2005-Jun-26 08:14 UTC
Strange SCSI behavior after upgrading from 5.2.1 to 5.4 (and a panic)
Hello,

On Thursday we upgraded one of our last 5.2.1 servers to 5.4. Tonight the server panicked and crashed, and I had to power-cycle it. Here are the logs from just before the panic:

Jun 26 03:45:50 patroklos kernel: (da0:ahc0:0:0:0): lost device
Jun 26 03:45:50 patroklos kernel: (da0:ahc0:0:0:0): Invalidating pack
Jun 26 03:46:00 patroklos last message repeated 2 times
Jun 26 03:46:06 patroklos kernel: initiate_write_filepage: already started
Jun 26 03:46:07 patroklos last message repeated 9 times
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): READ(10). CDB: 28 0 72 f 16 0 0 0 80 0
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): CAM Status: SCSI Status Error
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): SCSI Status: Check Condition
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): UNIT ATTENTION asc:29,0
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): Power on, reset, or bus device reset occurred
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): Retries Exhausted
Jun 26 03:46:07 patroklos kernel: (da0:ahc0:0:0:0): Invalidating pack
Jun 26 03:46:31 patroklos kernel: initiate_write_filepage: already started
Jun 26 03:46:43 patroklos kernel: panic: initiate_write_inodeblock_ufs2: already started

As I said, the machine could not recover from the panic, so there is no crash dump.

The 5.4 version of the dmesg output is available at:
http://noc.ntua.gr/~christia/tmp/dmesg-5.4.txt

The 5.2.1 version of the dmesg output is available at:
http://noc.ntua.gr/~christia/tmp/dmesg-5.2.1.txt

da0 is a 1302GB external IDE-to-SCSI RAID (8x200GB IDE drives in a RAID5 configuration, with a SCSI U160 interface).
FreeBSD 5.4 connects to da0 at 80MB/s (40.000MHz, offset 31, 16bit, Tagged Queueing Enabled), while FreeBSD 5.2.1 (and FreeBSD 5.3, which I just tried by booting the 5.3-RELEASE-i386-miniinst.iso) connects happily at 160MB/s (80.000MHz, offset 62, 16bit, Tagged Queueing Enabled). That is the transfer rate supported by both the RAID device and the SCSI card (Adaptec 3960D Ultra160 SCSI adapter/aic7899).

Any ideas what the problem could be, or where it lies? What has changed in 5.4?

We had preserved the 5.2.1 system disks, and after the crash we moved back to 5.2.1 until further notice. Now I'm thinking of trying 5.3, which seems to behave the same as 5.2.1 and will still be supported for a year or so.

Thanks,
Panagiotis
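
The two rates quoted above are consistent with a simple relation between the negotiated settings: the reported frequency times the bus width in bytes. Below is a minimal sketch of that arithmetic, assuming the MHz figure in dmesg can be read as millions of transfers per second. The function names and the regular expression are hypothetical helpers for illustration, not part of any FreeBSD tool.

```python
import re


def sync_rate_mb_s(freq_mhz: float, width_bits: int) -> float:
    """Throughput in MB/s, assuming freq_mhz is millions of transfers/s."""
    return freq_mhz * (width_bits // 8)


def rate_from_dmesg(fragment: str) -> float:
    """Pull the frequency and bus width out of a fragment such as
    '40.000MHz, offset 31, 16bit' and compute the implied rate."""
    match = re.search(r"([\d.]+)MHz.*?(\d+)bit", fragment)
    if match is None:
        raise ValueError("no 'NNN MHz ... NNbit' pattern found")
    return sync_rate_mb_s(float(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    # Settings quoted in this message for 5.4 and for 5.2.1/5.3.
    for fragment in ("40.000MHz, offset 31, 16bit",
                     "80.000MHz, offset 62, 16bit"):
        print(fragment, "->", rate_from_dmesg(fragment), "MB/s")
```

Under this reading, 5.4 negotiates half the frequency (and half the offset) of 5.2.1 and 5.3 on the same 16-bit bus, which accounts for the 80MB/s versus 160MB/s difference.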