George William Herbert
2008-Nov-04 00:24 UTC
[zfs-discuss] Bizarre S10U5 / zfs / iscsi / thumper / Oracle RAC problem
I'm looking for any pointers or advice on what might have happened to cause the following problem.

Setup:

Two X4500 / Sol 10 U5 iSCSI servers, and four T1000 S10 U4 -> U5 Oracle RAC DB heads as iSCSI clients.

iSCSI is set up using zfs volumes with shareiscsi=on. The slightly weird part: we partitioned the disks to get the maximum number of spindles available for "pseudo-RAID 10" performance zpools (500 GB disks, 465 GB usable, partitioned as 115 GB for the "fast" DB, 345 GB for the "archive" DB, and 5 GB for a "utility" area used for the OCR and VOTE partitions in RAC).

Disks on each server are set up the same way. The active zpool disks are arranged in 7 "fast" pools (the "fast" partition on target 1 of each SATA controller all together in one pool, target 2 on each in a second pool, etc.), 7 "archive" pools, and 7 "utility" pools. "fast" and "utility" are zpool pseudo-RAID 10; "archive" is RAID-Z. Fixed-size zfs volumes were built to the full capacity of each pool. (A rough sketch of the commands is in the P.S. at the end of this mail.)

The clients were S10U4 when we first spotted this; we upgraded them all to S10U5 as soon as we noticed that, but the problem happened again last week. The X4500s have been S10U5 since they were installed.

Problem:

Both servers have experienced a failure mode which initially manifested as an Oracle RAC crash and proved, via testing, to be an ignored iSCSI write to the "fast" partitions.

Test case (/tmp/zero is a 1 KB file full of zeros; the garbage characters below are the binary Oracle ASM disk header that dd reads back from the raw device):

# dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1
n??ORCLDISK FDATA_0008FDATAFDATA_0008?*?n??*?S??>? ?*5|
1+0 records in
1+0 records out
# dd of=/dev/rdsk/c2t42d0s6 if=/tmp/zero bs=1k count=1
1+0 records in
1+0 records out
# dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1
n??ORCLDISK FDATA_0008FDATAFDATA_0008?*?n??*?S??>? ?*5|
1+0 records in
1+0 records out
#

Once this starts happening, the same write behavior appears immediately on all clients, including new ones which had never previously been connected to the iSCSI server. We can write a block of all 0's, or all A's, to any of the other iSCSI devices and read it back fine. But the misbehaving one consistently refuses to actually commit writes: it accepts the write and returns normally, yet all reads get the old data.

zpool status, zfs list, /var/adm/messages, and everything else we look at on the servers say they're all happy and fine. But obviously something is very wrong with the particular volume / pool that is giving us problems.

A coworker fixed it the first time by running a manual resilver; once that was underway, writes did the right thing again. But that was just a shot in the dark - we saw no errors or any clear reason to resilver. (My best guess at the actual commands is in the P.S. as well.)

We saw it again, and it blew up the just-about-to-go-live database, and we had to cut over to SAN storage to hit the deploy window.

It has happened on both of the X4500s we were using for iSCSI, so it is not a single-point hardware issue. I have preserved the second failed system in its error state in case someone has ideas for more diagnostics. I have an open support ticket, but so far no hint of a solution.

Anyone on list have ideas?

Thanks....

-george william herbert
gherbert at retro.com
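
P.S. For concreteness, each pool and its exported volume were built along these lines. This is a from-memory sketch: the pool name, device names, exact mirror pairing, and the -V size are illustrative rather than the literal ones on our boxes.

    # one "fast" pool: the 115 GB slice of target 1 on each of the six
    # SATA controllers, paired into mirrors ("pseudo-RAID 10")
    zpool create fast1 \
        mirror c0t1d0s6 c1t1d0s6 \
        mirror c4t1d0s6 c5t1d0s6 \
        mirror c6t1d0s6 c7t1d0s6

    # fixed-size zvol sized to (close to) the pool's full capacity,
    # exported as an iSCSI target
    zfs create -V 340g fast1/vol
    zfs set shareiscsi=on fast1/vol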
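
On the "manual resilver" mentioned above: I wasn't the one at the keyboard, so the exact commands are an assumption on my part. On a mirror pool, forcing a resilver by hand amounts to detaching and re-attaching one side of a mirror (the re-attach resilvers that device); a scrub is the less invasive cousin and may be what was actually run:

    # detach one half of a mirror, then re-attach it; the attach
    # kicks off a resilver of that device
    zpool detach fast1 c1t1d0s6
    zpool attach fast1 c0t1d0s6 c1t1d0s6

    # or just scrub the pool and watch it
    zpool scrub fast1
    zpool status -v fast1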
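
Finally, for anyone with ideas about the preserved box: zpool status, zfs list, and /var/adm/messages are the checks that come back clean, as noted above. The FMA and iSCSI-target queries below are the other obvious places to poke; suggestions beyond these are very welcome.

    # pool and dataset health - all report clean
    zpool status -v
    zfs list

    # FMA error telemetry and any diagnosed faults
    fmdump -eV
    fmadm faulty

    # server-side view of the iSCSI targets
    iscsitadm list target -v

    # system log
    tail -50 /var/adm/messages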