Hi.

I installed Solaris Express Developer Edition (b79) on a Supermicro
quad-core Harpertown E5405 with 8 GB RAM and two internal SATA drives,
and installed Solaris onto one of the internal drives. I added an Areca
ARC-1680 SAS controller and configured it in JBOD mode, then attached an
external SAS cabinet with 16 SAS drives of 1 TB (931 binary GB). I
created a raidz2 pool with ten disks and one spare, and copied some
400 GB of small files, each approx. 1 MB. To simulate a disk crash I
pulled one disk out of the cabinet; ZFS faulted the drive, pulled in the
spare and started a resilver.

During the resilver one of the remaining disks had a checksum error and
was marked as degraded. The zpool is now unavailable. I first tried to
add another spare but got an I/O error. I then tried to replace the
degraded disk by adding a new one:

# zpool add ef1 c3t1d3p0
cannot open '/dev/dsk/c3t1d3p0': I/O error

Partial dmesg:

Jul 25 13:14:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
Jul 25 13:14:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi id=1 lun=3 fatal error on target, device was gone
Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=1
Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=0
Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi id=8 lun=0 ccb='0xffffff02e0c8be00' outstanding command timeout
Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi id=8 lun=0 fatal error on target, device was gone
Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi id=0 lun=0 ccb='0xffffff02e0c92a00' outstanding command timeout
Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi id=0 lun=0 fatal error on target, device was gone
Jul 25 13:15:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=1
Jul 25 13:15:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=0
Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi id=0 lun=5 ccb='0xffffff02e0c97200' outstanding command timeout
Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi id=0 lun=5 fatal error on target, device was gone
Jul 25 13:15:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=1
Jul 25 13:15:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=0
Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi id=1 lun=3 fatal error on target, device was gone
Jul 25 13:15:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=1
Jul 25 13:15:00 malene arcmsr: [ID 658202 kern.warning] WARNING: arcmsr0: tran reset level=0
Jul 25 13:15:00 malene scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,25f9@6/pci10b5,8533@0/pci10b5,8533@9/pci17d3,1680@0/sd@1,3 (sd8):
Jul 25 13:15:00 malene   offline or reservation conflict

/usr/sbin/zpool status
  pool: ef1
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: resilver in progress, 0.02% done, 5606h29m to go
config:

        NAME            STATE     READ WRITE CKSUM
        ef1             DEGRADED     0     0     0
          raidz2        DEGRADED     0     0     0
            spare       ONLINE       0     0     0
              c3t0d0p0  ONLINE       0     0     0
              c3t1d2p0  ONLINE       0     0     0
            c3t0d1p0    ONLINE       0     0     0
            c3t0d2p0    ONLINE       0     0     0
            c3t0d0p0    FAULTED     35 1.61K     0  too many errors
            c3t0d4p0    ONLINE       0     0     0
            c3t0d5p0    DEGRADED     0     0    34  too many errors
            c3t0d6p0    ONLINE       0     0     0
            c3t0d7p0    ONLINE       0     0     0
            c3t1d0p0    ONLINE       0     0     0
            c3t1d1p0    ONLINE       0     0     0
        spares
          c3t1d2p0      INUSE     currently in use

errors: No known data errors

When I try to start cli64 to access the ARC-1680 card it hangs as well.
Is this a deficiency in the arcmsr driver?

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
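For reference, zpool add grows the pool by attaching an additional top-level
vdev; swapping out a failing member of an existing raidz2 vdev is normally
done with zpool replace. A minimal sketch, with device names taken from the
status output above purely as examples (c3t0d5p0 as the degraded member,
c3t1d3p0 as the replacement):

  # zpool replace ef1 c3t0d5p0 c3t1d3p0    (resilver the new disk in place)
  # zpool status -v ef1                    (watch resilver progress)

Of course neither form can work while the controller is returning I/O errors
for the new device, which is the real problem being reported here.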
Hi Claus,

Claus Guttesen wrote:
> Hi.
>
> I installed solaris express developer edition (b79) on a supermicro
> quad-core harpertown E5405 with 8 GB ram and two internal sata-drives.
> I installed solaris onto one of the internal drives. I added an areca
> arc-1680 sas-controller and configured it in jbod-mode. I attached an
> external sas-cabinet with 16 sas-drives 1 TB (931 binary GB). I
> created a raidz2-pool with ten disks and one spare. I then copied some
> 400 GB of small files each approx. 1 MB. To simulate a disk-crash I
> pulled one disk out of the cabinet and zfs faulted the drive and used
> the spare and started a resilver.

I'm not convinced that this is a valid test; yanking a disk out
will have physical-layer effects apart from removing the device
from your system. I think relling or roch would have something
to say on this also.

> During the resilver-process one of the remaining disks had a
> checksum-error and was marked as degraded. The zpool is now
> unavailable. I first tried to add another spare but got I/O-error. I
> then tried to replace the degraded disk by adding a new one:
>
> # zpool add ef1 c3t1d3p0
> cannot open '/dev/dsk/c3t1d3p0': I/O error
>
> Partial dmesg:
>
> Jul 25 13:14:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi
> id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
> Jul 25 13:14:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi
> id=1 lun=3 fatal error on target, device was gone
> Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING:
> arcmsr0: tran reset level=1

tran reset with level=1 is a bus reset.

> Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING:
> arcmsr0: tran reset level=0

tran reset with level=0 is a target-specific reset, which arcmsr
doesn't support.

...

> Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi
> id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
> Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi
> id=1 lun=3 fatal error on target, device was gone

The command timed out because your system configuration was unexpectedly
changed in a manner which arcmsr doesn't support.

....

> /usr/sbin/zpool status
>   pool: ef1
>  state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
>         Sufficient replicas exist for the pool to continue functioning in a
>         degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the device
>         repaired.
>  scrub: resilver in progress, 0.02% done, 5606h29m to go
> config:
>
>         NAME            STATE     READ WRITE CKSUM
>         ef1             DEGRADED     0     0     0
>           raidz2        DEGRADED     0     0     0
>             spare       ONLINE       0     0     0
>               c3t0d0p0  ONLINE       0     0     0
>               c3t1d2p0  ONLINE       0     0     0
>             c3t0d1p0    ONLINE       0     0     0
>             c3t0d2p0    ONLINE       0     0     0
>             c3t0d0p0    FAULTED     35 1.61K     0  too many errors
>             c3t0d4p0    ONLINE       0     0     0
>             c3t0d5p0    DEGRADED     0     0    34  too many errors
>             c3t0d6p0    ONLINE       0     0     0
>             c3t0d7p0    ONLINE       0     0     0
>             c3t1d0p0    ONLINE       0     0     0
>             c3t1d1p0    ONLINE       0     0     0
>         spares
>           c3t1d2p0      INUSE     currently in use
>
> errors: No known data errors

A double disk failure while resilvering - not a good state for your
pool to be in.

Can you wait for the resilver to complete? Every minute that goes
by tends to decrease the estimate on how long remains.

In addition, why are you using p0 devices rather than GPT-labelled
disks (or whole-disk s0 slices)?

> When I try to start cli64 to access the arc-1680-card it hangs as well.
> Is this a deficiency in the arcmsr-driver?

I'll quibble - "this" can mean several things.

Yes, there seems to be an issue with arcmsr's handling of uncoordinated
device removal. I advise against doing this.

I don't know how cli64 works and you haven't provided any messages output
from the system at the time when "it hangs" - is that the cli64 util,
the system, your zpool? ...

For interest - which version of arcmsr are you running?


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp       http://www.jmcp.homeunix.com/blog
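A minimal sketch of the suggestion above: create the same layout on whole
disks instead of p0 partition devices, so ZFS can put its own EFI/GPT label
on each drive (device names are only examples; Claus does exactly this later
in the thread):

  # zpool create ef1 raidz2 c3t0d0 c3t0d1 c3t0d2 c3t0d3 c3t0d4 \
        c3t0d5 c3t0d6 c3t0d7 c3t1d0 c3t1d1 \
        spare c3t1d2

Given whole disks, ZFS will also enable the drives' write caches, which it
does not do for partition or slice devices.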
>> I installed solaris express developer edition (b79) on a supermicro
>> quad-core harpertown E5405 with 8 GB ram and two internal sata-drives.
>> I installed solaris onto one of the internal drives. I added an areca
>> arc-1680 sas-controller and configured it in jbod-mode. I attached an
>> external sas-cabinet with 16 sas-drives 1 TB (931 binary GB). I
>> created a raidz2-pool with ten disks and one spare. I then copied some
>> 400 GB of small files each approx. 1 MB. To simulate a disk-crash I
>> pulled one disk out of the cabinet and zfs faulted the drive and used
>> the spare and started a resilver.
>
> I'm not convinced that this is a valid test; yanking a disk out
> will have physical-layer effects apart from removing the device
> from your system. I think relling or roch would have something
> to say on this also.

In later tests I will use zpool to offline the disk instead. Thank you
for pointing this out.

>> During the resilver-process one of the remaining disks had a
>> checksum-error and was marked as degraded. The zpool is now
>> unavailable. I first tried to add another spare but got I/O-error. I
>> then tried to replace the degraded disk by adding a new one:
>>
>> # zpool add ef1 c3t1d3p0
>> cannot open '/dev/dsk/c3t1d3p0': I/O error
>>
>> Partial dmesg:
>>
>> Jul 25 13:14:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi
>> id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
>> Jul 25 13:14:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi
>> id=1 lun=3 fatal error on target, device was gone
>> Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING:
>> arcmsr0: tran reset level=1
>
> tran reset with level=1 is a bus reset
>
>> Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING:
>> arcmsr0: tran reset level=0
>
> tran reset with level=0 is a target-specific reset, which arcmsr
> doesn't support.
>
> ...
>
>> Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi
>> id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
>> Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi
>> id=1 lun=3 fatal error on target, device was gone
>
> The command timed out because your system configuration was unexpectedly
> changed in a manner which arcmsr doesn't support.

Are there alternative JBOD-capable SAS controllers in the same range as
the ARC-1680 that are compatible with Solaris? I chose the ARC-1680
since it's well supported on FreeBSD and Solaris.

>> /usr/sbin/zpool status
>>   pool: ef1
>>  state: DEGRADED
>> status: One or more devices are faulted in response to persistent errors.
>>         Sufficient replicas exist for the pool to continue functioning in a
>>         degraded state.
>> action: Replace the faulted device, or use 'zpool clear' to mark the device
>>         repaired.
>>  scrub: resilver in progress, 0.02% done, 5606h29m to go
>> config:
>>
>>         NAME            STATE     READ WRITE CKSUM
>>         ef1             DEGRADED     0     0     0
>>           raidz2        DEGRADED     0     0     0
>>             spare       ONLINE       0     0     0
>>               c3t0d0p0  ONLINE       0     0     0
>>               c3t1d2p0  ONLINE       0     0     0
>>             c3t0d1p0    ONLINE       0     0     0
>>             c3t0d2p0    ONLINE       0     0     0
>>             c3t0d0p0    FAULTED     35 1.61K     0  too many errors
>>             c3t0d4p0    ONLINE       0     0     0
>>             c3t0d5p0    DEGRADED     0     0    34  too many errors
>>             c3t0d6p0    ONLINE       0     0     0
>>             c3t0d7p0    ONLINE       0     0     0
>>             c3t1d0p0    ONLINE       0     0     0
>>             c3t1d1p0    ONLINE       0     0     0
>>         spares
>>           c3t1d2p0      INUSE     currently in use
>>
>> errors: No known data errors
>
> a double disk failure while resilvering - not a good state for your
> pool to be in.

The degraded disk came after I pulled the first disk and was not
intended. :-)

> Can you wait for the resilver to complete? Every minute that goes
> by tends to decrease the estimate on how long remains.

The resilver had approx. three hours remaining when the second disk was
marked as degraded. After that the resilver process (and access to the
raidz2 pool as such) stopped.

> In addition, why are you using p0 devices rather than GPT-labelled
> disks (or whole-disk s0 slices)?

My ignorance. I'm a fairly seasoned FreeBSD administrator and had
previously used da0, da1, da2 etc. when I defined a similar raidz2 on
FreeBSD. But when I installed Solaris I initially only saw lun 0 on
targets 0 and 1, and then tried the devices that I saw. And the p0
device in /dev/dsk was the first to respond to my zpool create
command. :^) Modifying /kernel/drv/sd.conf made all the LUNs visible.
Solaris is a different kind of animal.

I have destroyed the pool and created a new raidz2 using the c3t0d0,
c3t0d1, c3t0d2 etc. devices instead.

> I don't know how cli64 works and you haven't provided any messages output
> from the system at the time when "it hangs" - is that the cli64 util,
> the system, your zpool? ...

I tried to start the program but it hung. Here is an example from when
I can access the utility:

CLI> disk info
  #  Enc#  Slot#     ModelName                Capacity   Usage
==============================================================================
  1  01    Slot#1    N.A.                        0.0GB   N.A.
  2  01    Slot#2    N.A.                        0.0GB   N.A.
  3  01    Slot#3    N.A.                        0.0GB   N.A.
  4  01    Slot#4    N.A.                        0.0GB   N.A.
  5  01    Slot#5    N.A.                        0.0GB   N.A.
  6  01    Slot#6    N.A.                        0.0GB   N.A.
  7  01    Slot#7    N.A.                        0.0GB   N.A.
  8  01    Slot#8    N.A.                        0.0GB   N.A.
  9  02    SLOT 000  SEAGATE ST31000640SS     1000.2GB   JBOD
 10  02    SLOT 001  SEAGATE ST31000640SS     1000.2GB   JBOD
 11  02    SLOT 002  SEAGATE ST31000640SS     1000.2GB   JBOD
 12  02    SLOT 003  SEAGATE ST31000640SS     1000.2GB   JBOD
 13  02    SLOT 004  SEAGATE ST31000640SS     1000.2GB   JBOD
 14  02    SLOT 005  SEAGATE ST31000640SS     1000.2GB   JBOD
 15  02    SLOT 006  SEAGATE ST31000640SS     1000.2GB   JBOD
 16  02    SLOT 007  SEAGATE ST31000640SS     1000.2GB   JBOD
 17  02    SLOT 008  SEAGATE ST31000640SS     1000.2GB   JBOD
 18  02    SLOT 009  SEAGATE ST31000640SS     1000.2GB   JBOD
 19  02    SLOT 010  SEAGATE ST31000640SS     1000.2GB   JBOD
 20  02    SLOT 011  SEAGATE ST31000640SS     1000.2GB   JBOD
 21  02    SLOT 012  SEAGATE ST31000640SS     1000.2GB   JBOD
 22  02    SLOT 013  SEAGATE ST31000640SS     1000.2GB   JBOD
 23  02    SLOT 014  SEAGATE ST31000640SS     1000.2GB   JBOD
 24  02    SLOT 015  SEAGATE ST31000640SS     1000.2GB   JBOD
==============================================================================

> For interest - which version of arcmsr are you running?

I'm running the version that was supplied on the CD, this is
1.20.00.15 from 2007-04-04. The firmware is V1.45 from 2008-3-27.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
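A sketch of the software-only variant of the pull-the-disk test that Claus
mentions above, assuming the new c3t0dN device names (adjust to your own
pool):

  # zpool offline ef1 c3t0d3       (take the member out of service)
  # zpool online ef1 c3t0d3        (bring it back; ZFS resilvers the delta)

or, to exercise a hot spare and a replacement:

  # zpool offline ef1 c3t0d3
  # zpool replace ef1 c3t0d3 c3t1d3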
Chad Leigh -- Shire.Net LLC
2008-Jul-25 13:41 UTC
[zfs-discuss] zfs, raidz, spare and jbod
On Jul 25, 2008, at 7:27 AM, Claus Guttesen wrote:

> I'm running the version that was supplied on the CD, this is
> 1.20.00.15 from 2007-04-04. The firmware is V1.45 from 2008-3-27.

Check the version at the Areca website. They may have a more recent
driver there. The dates are later for the 1.20.00.15 and there is a
-71010 extension.

Otherwise, file a bug with Areca. They are pretty good about responding.

Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net
>> I'm running the version that was supplied on the CD, this is
>> 1.20.00.15 from 2007-04-04. The firmware is V1.45 from 2008-3-27.
>
> Check the version at the Areca website. They may have a more recent driver
> there. The dates are later for the 1.20.00.15 and there is a -71010
> extension.
>
> Otherwise, file a bug with Areca. They are pretty good about responding.

I actually tried this driver as well, but according to the pkginfo file
the driver from the ftp-server is VERSION=1.20.00.13,REV=2006.08.14
whereas the supplied driver is VERSION=1.20.00.15,REV=2007.08.14.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
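A quick way to compare what the package database says with what the kernel
has actually loaded (the Areca package name varies by driver bundle, so it
is shown here only as a placeholder):

  # modinfo | grep -i arcmsr       (revision of the loaded arcmsr module)
  # pkginfo | grep -i arc          (find the installed Areca package name)
  # pkginfo -l <areca-package>     (shows the VERSION/REV strings quoted above)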
>>>>> "jcm" == James C McPherson <James.McPherson at Sun.COM> writes:

   jcm> I'm not convinced that this is a valid test; yanking a disk

it is the ONLY valid test.  it's just testing more than ZFS.
Miles Nordin wrote:
>>>>>> "jcm" == James C McPherson <James.McPherson at Sun.COM> writes:
>
>    jcm> I'm not convinced that this is a valid test; yanking a disk
>
> it is the ONLY valid test.  it's just testing more than ZFS.

disagree.  It is only a test of the failure mode of yanking a disk.
I will submit that this failure mode is often best solved by door
locks, not software.

FWIW, I did post a Pareto chart of disk failure modes we measured in
our installed base over a large sample size.
http://blogs.sun.com/relling/entry/zfs_copies_and_data_protection
"yanking a disk" did not even make the "other" category.
 -- richard
>>>>> "re" == Richard Elling <Richard.Elling at Sun.COM> writes:

    re> I will submit that this failure mode is often best
    re> solved by door locks, not software.

First, not just door locks, but:

 * redundant power supplies

 * sleds and Maintain Me, Please lights

 * high-strung extremely conservative sysadmins who take months to do
   small jobs and demand high salaries

 * racks, pedestals, separate rooms, mains wiring diversity

in short, all the costly and cumbersome things ZFS is supposed to make
optional.

Secondly, from skimming the article you posted, ``did not even make
the Other category'' in this case seems to mean the study doesn't
consider it, not that you captured some wholistic reliability data and
found that it didn't occur.

Thirdly, as people keep saying over and over in here, the reason they
pull drives is to simulate the kind of fails-to-spin,
fails-to-IDENTIFY, spews-garbage-onto-the-bus drive that many of us
have seen cause lower-end systems to do weird things.  If it didn't
happen, we wouldn't have *SEEN* it, and wouldn't be trying to simulate
it.  You can't make me distrust my own easily-remembered experience
from like two months ago by plotting some bar chart.

A month ago you were telling us these tiny boards with some $10
chinese chip that split one SATA connector into two, built into Sun's
latest JBOD drive sleds, are worth a 500% markup on 1TB drives because
in the real world, cables fail, controllers fail, drives spew garbage
onto busses, therefore simple fan-out port multipliers are not good
enough---you need this newly-conceived ghetto-multipath.  Now you're
telling me failed controllers, cables, and drive firmware are allowed
to lock a whole kernel because they ``don't even make the Other
category.''  Sorry, that does not compute.

I think I'm going to want a ``simulate channel A failure'' button on
this $700 sled.  If only the sled weren't so expensive I could
simulate it myself by sanding off the resist and scribbling over the
traces with a pencil or something.  I basically don't trust any of it
any more, and I'll stop pulling drives when I have a
drive-failure-simulator I trust more than that procedure.  'zpool
offline' is not a drive-failure-simulator---I've already established
on my own system it's very different, and there is at least one fix
going into b94 trying to close that gap.

I'm sorry, this is just ridiculous.
On Fri, 25 Jul 2008, Miles Nordin wrote:

> I think I'm going to want a ``simulate channel A failure'' button on
> this $700 sled.  If only the sled weren't so expensive I could

Why don't you just purchase the smallest possible drive from Sun and
replace it with a cheap graymarket 1.5TB drive from some random place
on the net?  Then install the small drive in your home PC.  That is
what the rest of us who don't care about reliability, warranty, or
service do (but the home PC runs great!).

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>>>>> "bf" == Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:

    bf> purchase the smallest possible drive

right, good point.  The failed-channel-simulator could be constructed
from the smallest drive/sled module.
Miles Nordin wrote:
>>>>>> "re" == Richard Elling <Richard.Elling at Sun.COM> writes:
>
>     re> I will submit that this failure mode is often best
>     re> solved by door locks, not software.
>
> First, not just door locks, but:
>
>  * redundant power supplies
>
>  * sleds and Maintain Me, Please lights
>
>  * high-strung extremely conservative sysadmins who take months to do
>    small jobs and demand high salaries
>
>  * racks, pedestals, separate rooms, mains wiring diversity
>
> in short, all the costly and cumbersome things ZFS is supposed to make
> optional.

:-)  I don't think it is in the ZFS design scope to change diversity...

> Secondly, from skimming the article you posted, ``did not even make
> the Other category'' in this case seems to mean the study doesn't
> consider it, not that you captured some wholistic reliability data and
> found that it didn't occur.

You are correct that in the samples we collected, we had no records of
disks spontaneously falling out of the system.  The failures we
collected for this study were those not caused by service actions.

> Thirdly, as people keep saying over and over in here, the reason they
> pull drives is to simulate the kind of fails-to-spin,
> fails-to-IDENTIFY, spews-garbage-onto-the-bus drive that many of us
> have seen cause lower-end systems to do weird things.  If it didn't
> happen, we wouldn't have *SEEN* it, and wouldn't be trying to simulate
> it.  You can't make me distrust my own easily-remembered experience
> from like two months ago by plotting some bar chart.

What happens when the device suddenly disappears is that the device
selection fails.  This exercises a code path that is relatively short
and does the obvious.  A failure to spin exercises a very different
code path, because the host can often talk to the disk but the disk
itself is sick.

> A month ago you were telling us these tiny boards with some $10
> chinese chip that split one SATA connector into two, built into Sun's
> latest JBOD drive sleds, are worth a 500% markup on 1TB drives because
> in the real world, cables fail, controllers fail, drives spew garbage
> onto busses, therefore simple fan-out port multipliers are not good
> enough---you need this newly-conceived ghetto-multipath.  Now you're
> telling me failed controllers, cables, and drive firmware are allowed
> to lock a whole kernel because they ``don't even make the Other
> category.''  Sorry, that does not compute.

I believe the record will show that there are known bugs in the Marvell
driver which have caused this problem for SATA drives.  In the JBOD
sled case, this exact problem would not exist because you hot-plug to
SAS interfaces, not SATA interfaces -- different controller and driver.

> I think I'm going to want a ``simulate channel A failure'' button on
> this $700 sled.  If only the sled weren't so expensive I could
> simulate it myself by sanding off the resist and scribbling over the
> traces with a pencil or something.  I basically don't trust any of it
> any more, and I'll stop pulling drives when I have a
> drive-failure-simulator I trust more than that procedure.  'zpool
> offline' is not a drive-failure-simulator---I've already established
> on my own system it's very different, and there is at least one fix
> going into b94 trying to close that gap.
>
> I'm sorry, this is just ridiculous.

With parallel SCSI this was a lot easier -- we could just wire a switch
into the bus and cause stuck-at faults quite easily.  With SAS and SATA
it is more difficult because they only share differential pairs in a
point-to-point link.  There is link detection going on all of the time,
which precludes testing for stuck-at faults.  Each packet has CRCs, so
in order to induce a known bad packet for testing you'll have to write
some code which makes intentionally bad packets.  But this will only
really test the part of the controller chip which does CRC validation,
which is, again, probably not what you want.  It actually works a lot
more like Ethernet, which also has differential signalling, link
detection, and CRCs.

But if you really just want to do fault injections, then you should
look at ztest,
http://opensolaris.org/os/community/zfs/ztest/
though it is really a ZFS code-path exerciser and not a Marvell driver
path exerciser.  If you want to test the Marvell code path then you
might look at project COMSTAR, which will allow you to configure
another host to look like a disk; then you can make all sorts of
simulated disk faults by making unexpected responses, broken packets,
really slow responses, etc.
http://opensolaris.org/os/project/comstar/
 -- richard
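A rough outline of the COMSTAR approach described above: export a zvol from
a second host as an iSCSI LU and use that as the "disk" under test, so
faults (delays, dropped sessions, an offlined LU) can be injected from the
target side. Command names assume a build with COMSTAR and its iSCSI target
port provider installed, and the pool/zvol names are made up, so treat this
as a sketch rather than a recipe:

  # zfs create -V 50g tank/fakedisk
  # svcadm enable stmf
  # sbdadm create-lu /dev/zvol/rdsk/tank/fakedisk
  # stmfadm add-view <lu-guid-from-sbdadm-output>
  # itadm create-target

The initiator host then sees an ordinary-looking disk whose behaviour is
entirely under your control on the target side.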
Claus Guttesen wrote:
...
>>> Jul 25 13:15:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi
>>> id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
>>> Jul 25 13:15:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi
>>> id=1 lun=3 fatal error on target, device was gone
>> The command timed out because your system configuration was unexpectedly
>> changed in a manner which arcmsr doesn't support.
>
> Are there alternative jbod-capable sas-controllers in the same range
> as the arc-1680 that are compatible with solaris? I chose the
> arc-1680 since it's well-supported on FreeBSD and Solaris.

I don't know, quite probably :)  Have a look at the HCL for Solaris 10,
Solaris Express and OpenSolaris 2008.05 -

http://www.sun.com/bigadmin/hcl/
http://www.sun.com/bigadmin/hcl/data/sx/
http://www.sun.com/bigadmin/hcl/data/os/

>>> /usr/sbin/zpool status
>>>   pool: ef1
>>>  state: DEGRADED
>>> status: One or more devices are faulted in response to persistent errors.
>>>         Sufficient replicas exist for the pool to continue functioning in a
>>>         degraded state.
>>> action: Replace the faulted device, or use 'zpool clear' to mark the device
>>>         repaired.
>>>  scrub: resilver in progress, 0.02% done, 5606h29m to go
>>> config:
>>>
>>>         NAME            STATE     READ WRITE CKSUM
>>>         ef1             DEGRADED     0     0     0
>>>           raidz2        DEGRADED     0     0     0
>>>             spare       ONLINE       0     0     0
>>>               c3t0d0p0  ONLINE       0     0     0
>>>               c3t1d2p0  ONLINE       0     0     0
>>>             c3t0d1p0    ONLINE       0     0     0
>>>             c3t0d2p0    ONLINE       0     0     0
>>>             c3t0d0p0    FAULTED     35 1.61K     0  too many errors
>>>             c3t0d4p0    ONLINE       0     0     0
>>>             c3t0d5p0    DEGRADED     0     0    34  too many errors
>>>             c3t0d6p0    ONLINE       0     0     0
>>>             c3t0d7p0    ONLINE       0     0     0
>>>             c3t1d0p0    ONLINE       0     0     0
>>>             c3t1d1p0    ONLINE       0     0     0
>>>         spares
>>>           c3t1d2p0      INUSE     currently in use
>>>
>>> errors: No known data errors
>> a double disk failure while resilvering - not a good state for your
>> pool to be in.
>
> The degraded disk came after I pulled the first disk and was not intended. :-)

That's usually the case :)

>> Can you wait for the resilver to complete? Every minute that goes
>> by tends to decrease the estimate on how long remains.
>
> The resilver had approx. three hours remaining when the second disk
> was marked as degraded. After that the resilver process (and access
> as such) to the raidz2-pool stopped.

I think that's probably to be expected.

>> In addition, why are you using p0 devices rather than GPT-labelled
>> disks (or whole-disk s0 slices)?
>
> My ignorance. I'm a fairly seasoned FreeBSD-administrator and had
> previously used da0, da1, da2 etc. when I defined a similar raidz2 on
> FreeBSD. But when I installed solaris I initially saw lun 0 on target
> 0 and 1 and then tried the devices that I saw. And the p0-device in
> /dev/dsk was the first to respond to my zpool create-command. :^)

Not to worry - every OS handles things a little differently in that area.

> Modifying /kernel/drv/sd.conf made all the lun's visible.

Yes - by default the Areca will only present targets, not any luns
underneath, so sd.conf modification is necessary. I'm working on
getting that fixed.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp       http://www.jmcp.homeunix.com/blog
> I installed solaris express developer edition (b79) on a supermicro
> quad-core harpertown E5405 with 8 GB ram and two internal sata-drives.
> I installed solaris onto one of the internal drives. I added an areca
> arc-1680 sas-controller and configured it in jbod-mode. I attached an
> external sas-cabinet with 16 sas-drives 1 TB (931 binary GB). I
> created a raidz2-pool with ten disks and one spare. I then copied some
> 400 GB of small files each approx. 1 MB. To simulate a disk-crash I
> pulled one disk out of the cabinet and zfs faulted the drive and used
> the spare and started a resilver.
>
> During the resilver-process one of the remaining disks had a
> checksum-error and was marked as degraded. The zpool is now
> unavailable. I first tried to add another spare but got I/O-error. I
> then tried to replace the degraded disk by adding a new one:
>
> # zpool add ef1 c3t1d3p0
> cannot open '/dev/dsk/c3t1d3p0': I/O error
>
> Partial dmesg:
>
> Jul 25 13:14:00 malene arcmsr: [ID 419778 kern.notice] arcmsr0: scsi
> id=1 lun=3 ccb='0xffffff02e0ca0800' outstanding command timeout
> Jul 25 13:14:00 malene arcmsr: [ID 610198 kern.notice] arcmsr0: scsi
> id=1 lun=3 fatal error on target, device was gone
> Jul 25 13:14:00 malene arcmsr: [ID 658202 kern.warning] WARNING:
> arcmsr0: tran reset level=1
>
> Is this a deficiency in the arcmsr-driver?

I believe I have found the problem. I tried to define a raid-5 volume
on the ARC-1680 card and still saw errors as mentioned above. Areca
support suggested that I upgrade to the latest Solaris drivers (located
in the beta folder) and upgrade the firmware as well. I did both and it
somewhat solved my problems, but I had very poor write performance,
2-6 MB/s.

So I deleted my zpool, changed the ARC-1680 configuration and put all
disks in passthrough mode. I created a new zpool, performed similar
tests and have not experienced any abnormal behaviour.

I'm re-installing the server with FreeBSD and will do similar tests and
report back.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare
Hi,

One of the things you could have done to continue the resilver is
"zpool clear". This would have let you continue to replace the drive
you pulled out. Once that was done, you could have then figured out
what was wrong with the second faulty drive.

The second drive only had checksum errors; ZFS was doing its job, and
the data on it was still usable. zpool clear would have kept the pool
online, albeit with lots of complaints.

I have had to use zpool clear multiple times on one of my zpools after
a PSU failure took out a HDD and damaged another. Mobo and RAM died
too :( The damaged drive racked up thousands and thousands of errors
while I replaced the dead drive. In the end I only lost one small file.

I'm no expert, but that's how I got around a similar problem.

This message posted from opensolaris.org
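A minimal sketch of the sequence described above, using the pool name from
this thread (device names are examples only):

  # zpool clear ef1                        (reset the pool-wide fault state)
  # zpool replace ef1 c3t0d3p0 c3t1d3p0    (carry on replacing the pulled disk)
  # zpool status ef1                       (the resilver should continue)

zpool clear can also be aimed at a single device, e.g. "zpool clear ef1
c3t0d5p0", if only one disk's error counters need resetting.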
I had the same problem described by kometen with our Areca ARC-1680
controller on OpenSolaris 2008.05. We were using the controller in JBOD
mode and allowing zpool to use entire disks. Setting the drives to
pass-through mode in the Areca controller manager solved the issue.

Also worth noting: after the zpool degraded while operating in JBOD
mode and the controller was subsequently reconfigured to pass each disk
through, zpool import was able to recover the corrupted zpool.
-- 
This message posted from opensolaris.org
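For reference, the recovery step mentioned above is the normal import path
once the devices are visible again; a sketch, reusing the pool name from
earlier in the thread as an example:

  # zpool import           (scan attached devices for importable pools)
  # zpool import ef1       (import by name; add -f if the pool is flagged
                            as potentially active on another system)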
Could you explain whether you did any specific configuration on the
Areca RAID controller other than setting it to RAID and manually
marking every disk as pass-through, so that the disks are visible from
OpenSolaris?

I have an ARC-1680ix-16. I have tried two configurations: JBOD, and
RAID with all drives made pass-through as you suggest. In both
instances I can only see two hard drives from OpenSolaris. I have tried
a RHEL 5.2 installation with the Areca controller and I can see all the
drives using RAID with pass-through.

Do I need any boot parameters for OpenSolaris or something else?

The Areca controller assigned the following settings to the drives when
configured as pass-through:

Channel-SCSI_ID-LUN   Disk#
0-0-0                 01
0-0-1                 02
0-0-2                 03
0-0-3                 04
0-0-4                 05
0-0-5                 06
0-0-6                 07
0-0-7                 08
----------
0-1-0                 09
0-1-1                 10
0-1-2                 11
0-1-3                 12
0-1-4                 13
0-1-5                 14
0-1-6                 15
0-1-7                 16

My firmware is 1.45 and I am using the areca driver that comes with
OpenSolaris 2008.11.

Any help would be greatly appreciated.
-- 
This message posted from opensolaris.org
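Earlier in the thread the missing LUNs were made visible by adding entries
to /kernel/drv/sd.conf. A sketch of what that typically looks like for the
target/LUN layout listed above; the exact entries (and whether class or
parent is used to bind them) depend on your configuration, so verify against
your own controller listing before relying on this:

  name="sd" class="scsi" target=0 lun=1;
  name="sd" class="scsi" target=0 lun=2;
  ...
  name="sd" class="scsi" target=1 lun=7;

followed by "update_drv -f sd" (or a reconfiguration reboot) and "devfsadm"
so the new /dev/dsk nodes appear.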
I have installed OpenSolaris build 129 on our server. It has 12 disks
on an Areca 1130 controller, using the latest firmware.

I have put all the disks in JBOD and am running them in raidz2. After a
while the system hangs with:

arcmsr0: tran reset (level 0x1) called for target 4 lun 0
target reset not supported

What can I do? I really want it to work!

(I am going to set all the disks to pass-through on Monday.)
-- 
This message posted from opensolaris.org

(Attachments: solaris troubles.jpg, solaris troubles2.jpg --
http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100109/e5e26997/attachment.jpg
http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100109/e5e26997/attachment-0001.jpg)
We had a similar problem on an Areca 1680. It was caused by a drive that
didn't properly reset (it took ~2 seconds each time, according to the
drive tray's LED). Replacing the drive solved this problem, but then we
hit another problem which you can see in this thread:

http://opensolaris.org/jive/thread.jspa?threadID=121335&tstart=0

I'm curious whether you have a similar setup and encounter the same
problems. How did you set up your pools?

Please tell me if you have any luck setting the drives to pass-through.

Thanks,
Arnaud

Le 09/01/10 14:26, Rob a écrit :
> I have installed Opensolaris build 129 on our server. It has 12 disk
> at a Areca 1130 controller. Using the latest firmware.
>
> I have put all the disk in jbod and running them in raidz2. After a
> while the systems hangs with arcmsr0: tran reset (level 0x1) called
> for target4 lun 0
> target reset not supported
>
> What can I do? I really want it to work!
>
> (I am gone set all the disk to pass-through monday)
Hello Arnaud,

Thanks for your reply. We have a system (2 x Xeon 5410, Intel S5000PSL
mobo and 8 GB memory) with 12 x 500 GB SATA disks on an Areca 1130
controller. rpool is a mirror over 2 disks; 8 disks are in raidz2 with
1 spare. We have 2 aggregated links.

Our goal is an ESX storage system; I am using iSCSI and NFS to serve
space to our ESX 4.0 servers.

We can remove a disk with no problem. I can do a replace and the disk
is resilvered. That works fine here. Our problem comes when we give the
server a harder time: copy 60 GB+ of data or do some other work to put
the system under load, and it hangs. This happens after 5 minutes, or
after 30 minutes, or later, but it hangs. Then we get the problems
shown in the attached pictures.

I have also emailed Areca. I hope they can fix it.

Regards,
Rob
-- 
This message posted from opensolaris.org