Hi, I hope there's someone here who can possibly provide some assistance. I've had this read problem for the past 2 months and just can't get to the bottom of it.

I have a home snv_111b server with a ZFS raidz pool (4 x Samsung 750GB SATA drives). The motherboard is an ASUS M2N68-CM (4 SATA ports) with an Athlon LE1620 single-core CPU and 4GB of RAM. I am using it to store home videos captured from a Windows client. I've just been ftping the avi files over 100Mb ethernet, so writes were limited to around 11MB/sec. The array is 89% full at the moment after copying over a lot of files recently.

In short, I'm having trouble with reads to the point where the pool nearly hangs up completely, and in some cases it does and I have to restart the box. When I first built this box (initially snv101) it was working OK, but I hadn't really attempted to do much reading from it, just copying large avi files over (some up to 12-20GB each)... but now I need to read from it. I am finding 1 of the 4 devices seems to be the problem: iostat shows it crawling. At times it will get a burst of life and read for a few seconds, then hang up again. I tried swapping the drives around on different SATA ports on the motherboard (exported pool, swapped SATA cables around, imported pool), but the problem stays with the same drive (ie: the problem moves to the next SATA port, but the same drive). dd'ing from the device is no problem though, as long as I'm not trying to read from the pool. When the pool is in its very slow/hung state, you can't do anything with the devices.

Here's my devices:

tim@opensolaris:~$ iostat -xnz
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.5    2.7   98.1   29.4  0.1  0.0   18.6    8.3   1   2 c0d0
    3.9    0.6  220.9   57.9  0.0  0.0    5.6    1.2   0   0 c6d1
    8.1    0.0  456.8    0.0  0.0  0.0    0.4    0.5   0   0 c3t0d0
   12.8    0.0  717.7    0.0  5.5  0.2  433.1   14.6  18  19 c3t1d0
    4.9    0.0  279.1    0.0  0.0  0.0    0.7    0.5   0   0 c3t2d0
    4.9    0.0  279.0    0.0  0.0  0.0    0.9    0.6   0   0 c3t3d0

c0d0 is the rpool. c6d1 is a 1.5TB drive connected to a Sil3114 controller (backup1 pool). c3t0-c3t3 are the 4 x 750GB drives connected to the motherboard ports.

As you can see from the iostat above, c3t1d0 is acting differently. zpool status shows no errors. I have not upgraded the ZFS pools since I moved from snv101 to snv111. I've booted up snv101 and the problem still exists. At first I thought it was in some way linked to the size of the file I was attempting to read/copy, but I get the problem with files under 1GB.
This is typical output I am seeing from iostat during a file copy:

tim@opensolaris:~$ iostat -xnz 1
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   12.2    4.8  824.1   85.7  0.7  0.3   39.8   16.3  10  16 c0d0
    0.2    0.1   12.8    0.4  0.0  0.0    2.8    8.7   0   0 c6d1
    0.5    0.0   27.7    0.0  0.0  0.0    5.2    2.2   0   0 c3t0d0
    0.2    0.0   12.0    0.0  0.5  0.1 2068.0  247.4   5   6 c3t1d0
    0.5    0.0   32.2    0.0  0.0  0.0    4.5    2.6   0   0 c3t2d0
    0.5    0.0   30.9    0.0  0.0  0.0    5.4    2.8   0   0 c3t3d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0 21.0  1.0    0.0    0.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   92.0    0.0 5504.4  3.0  0.7   32.6    7.6  24  44 c6d1
    1.0    0.0   85.5    0.0 21.5  1.0 21510.7  999.9 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    2.0    0.0  128.0    0.0 22.5  1.0 11234.0  500.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    0.0   42.5    0.0 24.2  1.0 24189.5 1000.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0 25.0  1.0    0.0    0.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0 25.0  1.0    0.0    0.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    0.0  128.0    0.0 25.9  1.0 25914.1  999.6 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   26.0    0.0 1791.6    0.0  0.1  0.0    3.6    0.6   1   2 c3t0d0
    1.0    0.0   85.5    0.0 26.9  1.0 26908.0 1000.2 100 100 c3t1d0
   26.0    0.0 1791.6    0.0  0.1  0.0    4.4    0.7   2   2 c3t2d0
   25.0    0.0 1791.6    0.0  0.1  0.0    5.5    0.8   2   2 c3t3d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0 27.0  1.0    0.0    0.0 100 100 c3t1d0

and then you get the odd bursts (but not from c3t1d0):

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    0.0  128.0    0.0 34.0  1.0 33999.6 1000.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    0.0   85.5    0.0 34.0  1.0 34003.1 1000.1 100 100 c3t1d0
 1562.9    2.0 87312.9    0.0  0.0  0.7    0.0    0.5   0  75 c3t2d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    0.0   85.5    0.0 34.0  1.0 33997.3  999.9 100 100 c3t1d0
  728.1    0.0 40771.2    0.0  0.0  0.3    0.0    0.4   0  33 c3t2d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    0.0   42.5    0.0 34.0  1.0 33999.6 1000.0 100 100 c3t1d0

During dd tests, the speed is great from all 4 devices (the 4th one is a fraction slower; I think it's a different firmware revision):

tim@opensolaris:/dev/dsk$ pfexec dd if=/dev/dsk/c3t0d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 11.2845 s, 116 MB/s
tim@opensolaris:/dev/dsk$ pfexec dd if=/dev/dsk/c3t1d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 10.9697 s, 119 MB/s
tim@opensolaris:/dev/dsk$ pfexec dd if=/dev/dsk/c3t2d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 11.6693 s, 112 MB/s
tim@opensolaris:/dev/dsk$ pfexec dd if=/dev/dsk/c3t3d0 of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes (1.3 GB) copied, 14.2785 s, 91.8 MB/s

iostat while the dd tests were running:

    r/s    w/s     kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 2130.9    0.0 119329.2    0.0  0.0  0.9    0.0    0.4   1  92 c3t1d0
 2005.1    0.0 112287.7    0.0  0.0  0.9    0.0    0.4   1  88 c3t2d0
 2082.9    0.0 116642.0    0.0  0.0  0.9    0.0    0.4   1  93 c3t0d0
 1628.0    0.0  91168.3    0.0  0.0  0.9    0.0    0.6   0  94 c3t3d0

I've searched and read many OpenSolaris threads, bugs etc, but I just can't get to the bottom of this. To me it seems like a ZFS issue, but then it seems like a hardware issue, as it's always the same drive that hangs things up (but dd read tests from it are fine). I'd really appreciate it if anyone has any ideas or things to try. My read throughput is around 300KB/sec, next to nothing. I've got all this data in the pool and I can't access it.

Cheers
Tim
--
This message posted from opensolaris.org
The closest bug I can find is this: 6772082 (ahci: ZFS hangs when IO happens)
--
This message posted from opensolaris.org
On Wed, 16 Dec 2009, Tim wrote:
> read for a few seconds, then hang up again. I tried swapping the
> drives around on different SATA ports on the motherboard (exported
> pool, swapped SATA cables around, imported pool), but the problem
> stays with the same drive (ie: the problem moves to the next SATA
> port, but the same drive).

This should be a clue that the drive is to blame and that it should be replaced.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Bob, that was my initial thought as well when I saw the problem stay with the drive after moving it to a different SATA port, but then it doesn't explain why a dd test runs fine. I guess I could try a longer dd test; my dd test could have just been lucky and hit an OK part of the disk.

Would a scrub help, or would that just complicate things if the drive is bad and the scrub then has problems?

I haven't swapped out the disk yet as I don't have a spare, but I'm thinking I'll have to go buy another one in order to do the swap (and send the potentially faulty one back). The drives have only had a few days of use.

Tim
--
This message posted from opensolaris.org
On Wed, 16 Dec 2009, Tim wrote:
> Bob, that was my initial thought as well when I saw the problem stay with
> the drive after moving it to a different SATA port, but then it doesn't
> explain why a dd test runs fine. I guess I could try a longer dd test; my
> dd test could have just been lucky and hit an OK part of the disk.

You would have to dd from the entire disk to be sure. There is also the possibility of an access pattern that the disk does not like.

> Would a scrub help, or would that just complicate things if the drive is
> bad and the scrub then has problems?

It might help, but if the disk drive is trying excessively hard to recover data from a bad spot, then the scrub might take quite a long time. If it gets past that spot then all might be well, unless new bad spots form.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
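For reference, a whole-disk read test along those lines might look like the sketch below. This is only a sketch: the raw p0 whole-disk device and the block size are illustrative choices, and reading all 750GB will take a few hours.

  pfexec dd if=/dev/rdsk/c3t1d0p0 of=/dev/null bs=1024k
  iostat -xnz 10        # in a second terminal, watch asvc_t and %b for c3t1d0 while it runs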
I'll dd the whole disk tonight. I was thinking it was bad spots, given how some files copy better than others (admittedly the ones that do are small)... but in saying that, often seeing the throughput at 349k/sec on different files is rather odd. And even the files that manage to copy OK, the throughput is still stop/start all the way... the speed comes in bursts, then stops, then starts. So, thinking bad sectors were the reason, I ran the Samsung diags on the drive (non-destructive) and it scanned the entire surface with no problems.
--
This message posted from opensolaris.org
On 12/16/09 07:31 PM, Tim wrote:
> I'll dd the whole disk tonight. I was thinking it was bad spots, given how
> some files copy better than others (admittedly the ones that do are
> small)... but in saying that, often seeing the throughput at 349k/sec on
> different files is rather odd. And even the files that manage to copy OK,
> the throughput is still stop/start all the way... the speed comes in
> bursts, then stops, then starts. So, thinking bad sectors were the reason,
> I ran the Samsung diags on the drive (non-destructive) and it scanned the
> entire surface with no problems.

A naive question for the gurus - how about using format/analyze in non-destructive mode to look at suspect disks? It seems to me that it would be a bit harder on the disk than plain dd... Even if the disk wasn't fully dedicated to Solaris, it would exercise the bit that ZFS is using. We routinely run format/analyze on all new disks before putting them in service and I was wondering if that was a waste of time.

Thanks -- Frank
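For anyone following along, a rough sketch of the interactive sequence being suggested (menu entries from format(1M); the exact prompts vary by build, and the disk selection shown is illustrative):

  pfexec format
  ... choose the suspect disk (e.g. c3t1d0) from the disk menu ...
  format> analyze
  analyze> read          (non-destructive read pass over the whole disk)
  analyze> quit
  format> quit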
I never formatted these drives when I built the box; I just added them to ZFS. I can try format > analyze > read as well.
--
This message posted from opensolaris.org
Bob Friesenhahn
2009-Dec-17 01:42 UTC
[zfs-discuss] Solaris 10 pre-fetch bug fix on the way
Some of you may recall my complaints in July about very poor Solaris 10 performance when re-reading large collections of medium-sized (5MB) files, and recall verifying the situation on your own systems (including OpenSolaris). I now have an IDR installed under Solaris 10U8 which includes a fix for this problem. I am told that the fix will likely appear within the next month or two as a Solaris 10 kernel patch.

While first-read performance has not improved, the re-read performance improvement due to the fix is quite dramatic:

U8 Baseline (Generic_141445-09)
===============================

Doing initial (unmount/mount) 'cpio -C 131072 -o > /dev/null'
144000768 blocks

real    8m35.46s
user    0m4.21s
sys     1m16.30s

Doing second 'cpio -C 131072 -o > /dev/null'
144000768 blocks

real    35m35.85s
user    0m4.66s
sys     1m25.63s

U8 + IDR143158-03
=================

Doing initial (unmount/mount) 'cpio -C 131072 -o > /dev/null'
144000768 blocks

real    8m39.18s
user    0m4.47s
sys     1m26.90s

Doing second 'cpio -C 131072 -o > /dev/null'
144000768 blocks

real    8m36.04s
user    0m4.46s
sys     1m19.58s

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
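For anyone who wants to run this kind of first-read vs. re-read comparison themselves, a sketch of the test as described is below. The dataset name and path are illustrative, and -C 131072 gives cpio a 128KB buffer as in the figures above.

  zfs umount tank/files && zfs mount tank/files          # start with a cold cache
  cd /tank/files
  time find . -type f | cpio -C 131072 -o > /dev/null    # first (cold) read
  time find . -type f | cpio -C 131072 -o > /dev/null    # second pass shows re-read behaviour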
Thomas Burgess
2009-Dec-17 02:45 UTC
[zfs-discuss] Solaris 10 pre-fetch bug fix on the way
This is great, I remember reading about this. Wow, it sure did take a while, huh? Glad they finally got it working for you. What exactly caused this bug? If I remember right it wasn't just related to Solaris; I remember seeing the same behavior in FreeBSD.
Bob Friesenhahn
2009-Dec-17 03:25 UTC
[zfs-discuss] Solaris 10 pre-fetch bug fix on the way
On Wed, 16 Dec 2009, Thomas Burgess wrote:
> This is great, I remember reading about this. Wow, it sure did take a
> while, huh? Glad they finally got it working for you. What exactly caused
> this bug? If I remember right it wasn't just related to Solaris; I
> remember seeing the same behavior in FreeBSD.

I have had some version of the IDR here for maybe a month, but I had other things that I needed to attend to before I could test with it. The fix has been in development OpenSolaris for at least this long.

An analysis of the bug was posted to zfs-discuss. It seemed complicated and difficult for mere mortals to understand.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Hmm, interesting... I haven't tried dd yet... I've just been running a read test via format>analyze and it's showing up the slowdown.

It starts off reading fast... next time I looked at it, it was reading slowly and was up to 52100000. I started another read test on one of the other drives and in a few minutes it was reading past the sus one. I restarted the read test on the sus drive as I wanted to see at exactly what count it started to read slowly; it got to count 51200000 and then it slows right down and becomes erratic. I haven't let it run much longer than that. Interesting though... I dare say there are no bad sectors, but something is amiss with the scanning. Has anyone ever seen a drive behave like this before? I thought the count being 512xxxxx was a little odd too.

Going to do some more tests.
Tim
--
This message posted from opensolaris.org
On Dec 16, 2009, at 9:17 PM, Tim wrote:
> Hmm, interesting... I haven't tried dd yet... I've just been running a
> read test via format>analyze and it's showing up the slowdown.
>
> It starts off reading fast... next time I looked at it, it was reading
> slowly and was up to 52100000. I started another read test on one of the
> other drives and in a few minutes it was reading past the sus one. I
> restarted the read test on the sus drive as I wanted to see at exactly
> what count it started to read slowly; it got to count 51200000 and then it
> slows right down and becomes erratic. I haven't let it run much longer
> than that. Interesting though... I dare say there are no bad sectors, but
> something is amiss with the scanning. Has anyone ever seen a drive behave
> like this before? I thought the count being 512xxxxx was a little odd too.

Try yelling at it :-)
But seriously, vibration could cause this.
 -- richard
At exactly the same spot it slows down? I've just run the test a number of times, and without fail, at exactly the same spot the read just crawls along erratically. It's at approx 51256xxx...
--
This message posted from opensolaris.org
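One way to cross-check that region outside of format would be a targeted dd (a sketch only, assuming the analyze counter is a 512-byte block address and using Solaris dd's iseek operand; the device name and numbers are illustrative):

  pfexec dd if=/dev/rdsk/c3t1d0p0 of=/dev/null bs=64k iseek=400000 count=20000

Here 64k x 400000 is the same byte offset as 512-byte block 51,200,000, so this reads roughly 1.3GB starting just before the point where the slowdown shows up; a healthy drive should stream through it at full speed.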
I'm just doing a surface scan via the Samsung utility to see if I see the same slowdown...
--
This message posted from opensolaris.org
Hmm, I'm not seeing the same slowdown when I boot from the Samsung EStool CD and run a diag which performs a surface scan... Could this still be a hardware issue, or possibly something with the Solaris data format on the disk?
--
This message posted from opensolaris.org
On Wed, Dec 16 at 22:41, Tim wrote:
> Hmm, I'm not seeing the same slowdown when I boot from the Samsung EStool
> CD and run a diag which performs a surface scan... Could this still be a
> hardware issue, or possibly something with the Solaris data format on the
> disk?

Rotating drives often have various optimizations to help recover from damaged servo sectors when reading sequentially, in that they can skip over bad areas and just "assume" that the position information is there, until they get an ECC fatal on a read. Until the drive wanders off-track, it just keeps reading until it eventually finds some position information.

I'm guessing you have a physical problem with the servo wedges on that drive that only manifests itself in some of your access methods. Does the drive click or make any other noises when this is happening?

For the price of drives today, I'd buy a replacement and look at swapping that one out. You can always keep it as a spare for later.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
Tim,

Use the fmdump -eV command to see what disk errors are reported through the fault management system, and see what output iostat -En might provide.

Cindy

On 12/16/09 23:41, Tim wrote:
> Hmm, I'm not seeing the same slowdown when I boot from the Samsung EStool
> CD and run a diag which performs a surface scan... Could this still be a
> hardware issue, or possibly something with the Solaris data format on the
> disk?
fmdump shows errors on a different drive, and none on the one that has this slow read problem:

Nov 27 2009 20:58:28.670057389 ereport.io.scsi.cmd.disk.recovered
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.recovered
        ena = 0xbeb7f4dd5300001
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci1043,82b3@9/disk@2,0
                devid = id1,sd@SATA_____SAMSUNG_HD753LJ_______S1PWJ1CQ801987
        (end detector)
        driver-assessment = recovered
        op-code = 0x28
        cdb = 0x28 0x0 0x4 0x80 0x32 0x80 0x0 0x0 0x80 0x0
        pkt-reason = 0x0
        pkt-state = 0x1f
        pkt-stats = 0x50
        __ttl = 0x1
        __tod = 0x4b0fa2c4 0x27f043ad

The serial number of the sus drive is S1PWJ1CQ801987.

iostat -En shows:

c0d0     Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST360021A Revision: Serial No: 3HR2AG72 Size: 60.02GB <60020932608 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0

c6d1     Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: SAMSUNG HD154UI Revision: Serial No: S1Y6J1KS720622 Size: 1500.30GB <1500295200768 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0

c3t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: SAMSUNG HD753LJ Revision: 1113 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 28 Predictive Failure Analysis: 0

[b]c3t1d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: SAMSUNG HD753LJ Revision: 1113 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 49 Predictive Failure Analysis: 0[/b]

c3t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: SAMSUNG HD753LJ Revision: 1113 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 28 Predictive Failure Analysis: 0

c3t3d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: SAMSUNG HD753LJ Revision: 1110 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 28 Predictive Failure Analysis: 0

c0t1d0   Soft Errors: 0 Hard Errors: 30 Transport Errors: 0
Vendor: ATAPI Product: CD-RW 52X24 Revision: F.JZ Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 30 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
--
This message posted from opensolaris.org
I tried to buy another drive today (750GB or 1TB) to swap out c3t1d0 (750GB) but could not find one quickly. So I was thinking, as a temporary measure, of using my 1.5TB disk instead, as it can be reused at the moment (it is currently attached to a sil3114 controller - c6d1p0).

Would it be OK to do a zpool replace with this 1.5TB disk attached to the sil3114 controller (it's only a SATA1 controller, whereas the motherboard ports are SATAII):

"zpool replace storage c3t1d0 c6d1p0"

or should I just physically disconnect the sus 750GB drive (c3t1d0), plug the 1.5TB disk into that port and then run:

"zpool replace storage c3t1d0"

What's the best approach? And can you confirm the correct steps to take please. If it's the latter option, do I need to do anything with taking the pool offline at all, or is it just a matter of shutting the box down, swapping the cables over, starting it up and, when it comes up degraded (assuming it will), running the zpool replace command?

I'm feeling a little uneasy about the whole thing, as I have no other backup at the moment other than this array being RAIDZ... my 1.5TB disk was to be a second copy/backup of most of the RAIDZ data (not the best backup, but at least something), but to date I can't read well enough from the RAIDZ pool to do that backup... and it's a few years of home video. I think I'd just fall to pieces if I lost it.

I'd greatly appreciate the advice.
Tim
--
This message posted from opensolaris.org
Hi Tim,

I looked up the sil3114 controller and I found this CR:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6813171
sil3114 sata controller not supported

If you can see this disk with format, then I guess I'm less uneasy, but due to the hardware support issue, you might try to create a test pool first like this to see if this disk is usable:

# zpool create test c6d1

The "c6d1p0" device is the fdisk partition, so just use "c6d1" instead. Then, write and read some data to this pool. If this works, then destroy test and use zpool replace to replace the troublesome disk in your main pool as you identified:

# zpool replace storage c3t1d0 c6d1

I don't think ZFS cares if the devices are SATA1 and SATAII. I wonder if anyone else is using this controller/disk and can comment.

Thanks,

Cindy
Hi Cindy,

I had similar concerns, however I wasn't aware of that bug. Before I bought this controller I had read a number of people saying that they had problems, and other people saying they didn't have problems, with the sil3114. I was originally after a sil3124 (SATAII) but given my future drives didn't need the extra speed I settled on the cheaper sil3114; the sil3124 was 5 times the cost of a sil3114. A friend was running a sil3112 (2-port SATAI card) and that appeared to be fine. So I bought the sil3114.

As far as I've seen, the card is fine. I have already connected and created a pool on the 1.5TB disk attached to the sil3114 using snv111. I think from memory I even booted back to snv101 and it still recognised it as well. It's after I started copying files to this new pool that I found my read problem on the main 'storage' pool. That was my plan for a backup device: just a single drive to start with, in its own pool off the sil3114, and then I could add to it as needed, hence the sil3114 & 1.5TB disk.

I'm fairly sure though when I created the new pool 'backup1' that I used the device c6d1p0, not p1. I'll try c6d1p1 today to make sure that is OK. Is there a problem using c6d1p0?

Do you or anyone else know, when a disk is replaced via:

zpool replace pool_name old_disk new_disk

when does the old_disk actually get removed from the pool? Is it before the new_disk starts its resilver, or is it after the new_disk has been resilvered?

Once I get the drives swapped in the array, I was going to reformat the sus 750GB drive and give it another workout to see if the read slowdown persists, before sending it back to Samsung. The drive is 14 months old, but it's had probably 2 weeks of total use.

Cindy, thanks for the reply, I really appreciate it.
Tim
--
This message posted from opensolaris.org
Hi Tim,

The p* devices represent the larger Solaris fdisk container, so a possible scenario is that someone could create a pool containing a p0 device that points to the same blocks as another partition within that container which is also included in the pool. This would be bad.

I think it's a bug that you can create a pool on a p* device, because we are unsure that all operations are supported on p* devices. We don't test pool operations on p* devices.

Cindy
There's actually no device c6d1 in /dev/dsk, only:

tim@opensolaris:/dev/dsk$ ls -l c6d1*
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1p0 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:q
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1p1 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:r
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1p2 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:s
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1p3 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:t
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1p4 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:u
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s0 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:a
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s1 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:b
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s10 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:k
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s11 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:l
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s12 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:m
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s13 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:n
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s14 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:o
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s15 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:p
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s2 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:c
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s3 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:d
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s4 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:e
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s5 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:f
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s6 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:g
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s7 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:h
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s8 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:i
lrwxrwxrwx 1 root root 62 2009-10-27 18:03 c6d1s9 -> ../../devices/pci@0,0/pci10de,561@8/pci-ide@6/ide@0/cmdk@1,0:j
--
This message posted from opensolaris.org
Should I use slice 2 instead of p0:

Part      Tag    Flag     Cylinders         Size            Blocks
  0 unassigned    wm       0                0         (0/0/0)           0
  1 unassigned    wm       0                0         (0/0/0)           0
  2     backup    wu       0 - 60796        1.36TB    (60797/0/0) 2930111415
  3 unassigned    wm       0                0         (0/0/0)           0
  4 unassigned    wm       0                0         (0/0/0)           0
  5 unassigned    wm       0                0         (0/0/0)           0
  6 unassigned    wm       0                0         (0/0/0)           0
  7 unassigned    wm       0                0         (0/0/0)           0
  8       boot    wu       0 - 0            23.53MB   (1/0/0)       48195
  9 alternates    wm       1 - 2            47.07MB   (2/0/0)       96390

or should I create a proper device c6d1?
--
This message posted from opensolaris.org
I had referred to this blog entry: http://blogs.sun.com/observatory/entry/which_disk_devices_to_use -- This message posted from opensolaris.org
Hmm, OK, the replace with the existing drive still in place wasn't the best option... it's replacing, but very slowly as it's reading from that sus disk:

  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h3m, 0.00% done, 2895h58m to go
config:

        NAME           STATE     READ WRITE CKSUM
        storage        ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            c3t0d0     ONLINE       0     0     0
            replacing  ONLINE       0     0     0
              c3t1d0   ONLINE       0     0     0
              c6d1p0   ONLINE       0     0     0  13.3M resilvered
            c3t2d0     ONLINE       0     0     0
            c3t3d0     ONLINE       0     0     0

tim@opensolaris:~$ iostat -xnz 1
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   12.2    5.4  845.4  105.5  0.8  0.3   45.7   14.4   9  15 c0d0
    0.2    8.9    1.8   36.2  0.0  0.0    4.4    1.1   0   1 c6d1
    1.5    1.1   89.7    3.4  0.1  0.0   24.0    4.7   1   1 c3t0d0
    1.5    1.1   89.6    3.4  5.2  0.5 1986.6  196.2  51  52 c3t1d0
    1.6    1.3   92.1    3.2  0.1  0.0   19.0    3.8   1   1 c3t2d0
    1.5    1.2   89.0    2.8  0.1  0.0   21.3    4.4   1   1 c3t3d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0 13.0  1.0    0.0    0.0 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    2.0    0.0   85.5  0.0  0.0    0.0    1.1   0   0 c6d1
    1.0    0.0   85.5    0.0 12.8  1.0 12838.7  999.7 100 100 c3t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0 12.0  1.0    0.0    0.0 100 100 c3t1d0

Can I stop the resilver, or maybe shut down the box to disconnect the dead disk? Is it OK to reboot the box while it's doing a resilver? ie: will it continue when it's back up?
--
This message posted from opensolaris.org
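On the question of stopping it: as far as I know, the usual way to back out of an in-progress replace is to detach the incoming device from the "replacing" group, while an interrupted resilver on builds of this vintage restarts from the beginning after a reboot rather than resuming. A sketch, using the device names from this thread:

  pfexec zpool detach storage c6d1p0    # cancels the replace; the pool keeps using c3t1d0
  zpool status -v storage               # confirm what the pool is doing afterwards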
Slow and steady wins the race?

I ended up doing a zpool remove of c6d1p0. This stopped the replace and removed c6d1p0, and left the array doing a scrub, which by my rough calculations was going to take around 12 months and increasing!

So I shut the box down, disconnected the SATA cable from c3t1d0 and restarted the box... it then hung on startup, wouldn't mount any ZFS volumes and showed no disk activity...

So I shut the box down again, reconnected the SATA cable to c3t1d0 and restarted the box. It came up (phew). zpool status showed the scrub was running again:

 scrub: scrub in progress for 0h3m, 0.00% done, 8769h2m to go
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0

So this time I offlined c3t1d0:

        NAME        STATE     READ WRITE CKSUM
        storage     DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  OFFLINE      0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0

then I ran the zpool replace and now it's doing the replace:

 scrub: resilver in progress for 0h15m, 4.10% done, 6h7m to go
config:

        NAME           STATE     READ WRITE CKSUM
        storage        DEGRADED     0     0     0
          raidz1       DEGRADED     0     0     0
            c3t0d0     ONLINE       0     0     0
            replacing  DEGRADED     0     0  570K
              c3t1d0   OFFLINE      0     0     0
              c6d1p0   ONLINE       0     0     0  23.7G resilvered
            c3t2d0     ONLINE       0     0     0
            c3t3d0     ONLINE       0     0     0

I don't care if it takes a day, as long as it works :o)

Thanks so far for the advice, I'll let you know how it goes.
--
This message posted from opensolaris.org
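Once the resilver finishes, the old disk should drop out of the "replacing" group on its own. A quick sketch of the follow-up checks (pool name as used in this thread; the scrub at the end is optional):

  zpool status -x             # should report the pools as healthy once the resilver completes
  zpool status storage        # c6d1p0 should now sit where c3t1d0 was
  pfexec zpool clear storage  # clear the checksum counters recorded during the replace
  pfexec zpool scrub storage  # optionally verify the whole pool on the new disk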