-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I have two (almost) identical machines with similar problems. These are
(old!) Dell PowerApp 100 boxes with an Adaptec 29160LP on a 32-bit PCI
riser.

It seems that I can provoke a SCSI error by running smartctl (to read the
drive temperatures). When this poll collides with other disk activity, I
get anything from an apparent interrupt loss to a complete dump of the
card state.

Some details:

First box is completely stock Dell-built:

  DMI type 1, 25 bytes.
    System Information
      Manufacturer: Dell Computer Corp.
      Product Name: PowerApp.web 100 W
      Version: SPAW70W

  DMI type 2, 8 bytes.
    Base Board Information
      Manufacturer: Intel Corporation
      Product Name: TR440BXA
      Version: A16643-305

imb@sarah:/home/imb# lspci
00:00.0 Host bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled) (rev 03)
00:07.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:0c.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:0d.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:0e.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200 (rev 01)
00:0f.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)

From dmesg:

ahc0: <Adaptec 29160B Ultra160 SCSI adapter> port 0xe800-0xe8ff mem 0xfebfb000-0xfebfbfff irq 10 at device 15.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

  [ .. deletia .. ]

da0 at ahc0 bus 0 target 0 lun 0
da0: <QUANTUM ATLAS_V_18_WLS 0200> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17510MB (35861388 512 byte sectors: 255H 63S/T 2232C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <QUANTUM ATLAS_V_18_WLS 0200> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 17510MB (35861388 512 byte sectors: 255H 63S/T 2232C)

.. yet it occasionally reports ..

ahc0: Timedout SCBs already complete. Interrupts may not be functioning.

The second machine is identical other than a pair of U320 drives running
at U160 ..

da0 at ahc0 bus 0 target 0 lun 0
da0: <FUJITSU MAP3147NP 5608> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da0: 140014MB (286749480 512 byte sectors: 255H 63S/T 17849C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <FUJITSU MAP3147NP 5608> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da1: 140014MB (286749480 512 byte sectors: 255H 63S/T 17849C)

.. and it occasionally reports as above but, since I beat up on these
disks more, it also fails with a complete state dump of the card (as
attached).

I tried updating the 29160LP firmware from the Dell-supplied rev 2.57 to
3.10 with no impact on the problem (so it's back the way it was).
I checked terminations and cabling - all as built by Dell - <shrug>

Any ideas as to how to approach this would be more than welcome,

Michael Butler CISSP
Security Consultant
PGP Key ID: 0x5E873CC5
Fingerprint: 2CFF 581F D192 F885 7ED9 3C44 889C A479 5E87 3CC5
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)

iD8DBQFCzU9BiJykeV6HPMURAqXRAJ92gqCeFFUKE91BAVPst/JxC1GVCQCeLPxM
xWYXNtT921qalLpPgBQ1M/8=sPk6
-----END PGP SIGNATURE-----
-------------- next part --------------
ahc0: Recovery Initiated
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
ahc0: Dumping Card State while idle, at SEQADDR 0x8
Card was paused
ACCUM = 0x4, SINDEX = 0x64, DINDEX = 0x65, ARG_2 = 0x23
HCNT = 0x0 SCBPTR = 0x1c
SCSIPHASE[0x8]:(MSG_IN_PHASE) SCSISIGI[0xe6]:(REQI|BSYI|MSGI|IOI|CDI)
ERROR[0x0] SCSIBUSL[0x80] LASTPHASE[0x1]:(P_BUSFREE)
SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI) SBLKCTL[0xa]:(SELWIDE|SELBUSB)
SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED)
SSTAT0[0x20]:(SELDI) SSTAT1[0x11]:(REQINIT|PHASEMIS)
SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP)
SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
SXFRCTL0[0x80]:(DFON) DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
STACK: 0xe2 0x164 0x179 0x3
SCB count = 254
Kernel NEXTQSCB = 247
Card NEXTQSCB = 247
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 28:142 1:43
QOUTFIFO entries:
Sequencer Free SCB List: 27 22 14 31 11 9 0 5 25 29 6 2 23 20 24 12 10 26 18 21 16 15 3 7 30 13 4 17 8 19
Sequencer SCB Info:
  0 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  1 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0x2b]
  2 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  3 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  4 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  5 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  6 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  7 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
  8 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
  9 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 10 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 11 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 12 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 13 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 14 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 15 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 16 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 17 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 18 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 19 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 20 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 21 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 22 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 23 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 24 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 25 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 26 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 27 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 28 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0x8e]
 29 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
 30 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17] SCB_LUN[0x0] SCB_TAG[0xff]
 31 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
Pending list:
142 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x0]
 43 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x0]
Kernel Free SCB list: 55 13 12 223 89 24 22 127 118 189 201 42 176 1 79 167 231 63 83 232 107 238 204 162 219 104 93 158 140 166 28 26 163 122 160 159 139 16 124 91 84 97 64 52 146 233 165 11 168 17 179 86 180 25 9 177 230 151 197 216 183 33 149 66 81 61 35 92 228 7 4 58 210 198 125 187 40 164 161 171 106 199 145 59 150 190 134 5 236 253 57 224 39 132 105 202 217 213 215 241 115 212 103 243 251 192 152 0 8 182 220 157 239 188 203 174 226 208 169 72 20 113 218 14 154 73 32 246 47 119 117 195 13 6 48 56 102 31 77 123 196 3 155 135 214 49 172 221 194 2 94 222 130 133 27 156 46 99 19 211 148 67 76 184 173 137 234 100 36 170 30 10 252 78 70 111 229 235 101 15 108 175 71 120 23 193 65 114 249 112 44 126 248 88 138 80 68 87 38 128 207 191 6 178 147 116 29 82 1 44 69 53 240 185 129 95 131 75 200 153 206 244 227 110 45 109 121 90 181 242 143 186 209 225 50 41 34 51 245 74 18 98 141 237 60 250 54 37 205 62 85 96 21

<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
(pass1:ahc0:0:1:0): SCB 0x8e - timed out
sg[0] - Addr 0x15bccc60 : Length 36
(pass1:ahc0:0:1:0): Queuing a BDR SCB
(pass1:ahc0:0:1:0): Bus Device Reset Message Sent
(pass1:ahc0:0:1:0): no longer in timeout, status = 24b
ahc0: Bus Device Reset on A:1. 1 SCBs aborted
ahc0: Timedout SCBs already complete. Interrupts may not be functioning.
(da1:ahc0:0:1:0): WRITE(10). CDB: 2a 0 0 44 1d c2 0 0 20 0
(da1:ahc0:0:1:0): CAM Status: SCSI Status Error
(da1:ahc0:0:1:0): SCSI Status: Check Condition
(da1:ahc0:0:1:0): UNIT ATTENTION asc:29,3
(da1:ahc0:0:1:0): Bus device reset function occurred
(da1:ahc0:0:1:0): Retrying Command (per Sense Data)
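
For anyone reading the trace: the CDB of the WRITE(10) that was retried after the bus device reset can be decoded by hand to see which part of the disk was being written. The following is only a rough Python sketch, assuming the standard WRITE(10) layout (opcode 0x2A, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8) and the 512-byte sectors that dmesg reports for da1. The helper name decode_write10 is made up for illustration and is not part of any tool mentioned in this post.

```python
def decode_write10(cdb_text, sector_size=512):
    """Decode a WRITE(10) CDB given as space-separated hex bytes.

    sector_size is an assumption taken from the dmesg output
    ("512 byte sectors"); adjust it for other drives.
    """
    cdb = [int(b, 16) for b in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    # Bytes 2..5: logical block address, most significant byte first.
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    # Bytes 7..8: transfer length in blocks, most significant byte first.
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")
    return {"lba": lba, "blocks": blocks, "bytes": blocks * sector_size}


if __name__ == "__main__":
    # CDB copied from the da1 error lines in the log.
    print(decode_write10("2a 0 0 44 1d c2 0 0 20 0"))
```

Under those assumptions the command is a 32-block (16 KiB) write starting at LBA 4464066, which is ordinary filesystem traffic and suggests the write was simply caught by the reset rather than being the cause of it.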