I suspect that something is wrong with one of my disks. This is the output from iostat:

                            extended device statistics       ---- errors ---
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    2.0   18.9   38.1   160.9  0.0  0.1    0.1    3.2   0   6   0   0   0   0 c5d0
    2.7   18.8   59.3   160.9  0.0  0.1    0.2    3.2   0   6   0   0   0   0 c5d1
    0.0   36.8    1.1  3593.7  0.0  0.1    0.0    2.9   0   8   0   0   0   0 c6t66d0
    0.0   38.2    0.0  3693.7  0.0  0.2    0.0    4.6   0  12   0   0   0   0 c6t70d0
    0.0   38.1    0.0  3693.7  0.0  0.1    0.0    2.4   0   5   0   0   0   0 c6t74d0
    0.0   42.0    0.0  4155.4  0.0  0.0    0.0    0.6   0   2   0   0   0   0 c6t76d0
    0.0   36.9    0.0  3593.7  0.0  0.1    0.0    1.4   0   3   0   0   0   0 c6t78d0
    0.0   41.7    0.0  4155.4  0.0  0.0    0.0    1.2   0   4   0   0   0   0 c6t80d0

The disk in question is c6t70d0 - it shows consistently higher %b and asvc_t
than the other disks in the pool. The output is from a 'zfs receive' after about 3 hours.
The two c5dx disks are the 'rpool' mirror, the others belong to the 'backup' pool.

admin@master:~# zpool status
  pool: backup
 state: ONLINE
  scan: scrub repaired 0 in 5h7m with 0 errors on Tue Jan 31 04:55:31 2012
config:

        NAME         STATE     READ WRITE CKSUM
        backup       ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            c6t78d0  ONLINE       0     0     0
            c6t66d0  ONLINE       0     0     0
          mirror-1   ONLINE       0     0     0
            c6t70d0  ONLINE       0     0     0
            c6t74d0  ONLINE       0     0     0
          mirror-2   ONLINE       0     0     0
            c6t76d0  ONLINE       0     0     0
            c6t80d0  ONLINE       0     0     0

errors: No known data errors

admin@master:~# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
backup  4.53T  1.37T  3.16T    30%  1.00x  ONLINE  -

admin@master:~# uname -a
SunOS master 5.11 oi_148 i86pc i386 i86pc

Should I be worried? And what other commands can I use to investigate further?
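[A quick note on the output above: a single iostat invocation reports averages since boot, so sampling at intervals is one reasonable way to confirm that the imbalance on c6t70d0 is persistent rather than historical. The exact flags and counts below are only an illustrative choice, not something prescribed in the thread:]

# Extended statistics plus error counters, every 10 seconds for 6 samples;
# -n uses cXtYdZ names, -z hides devices that are completely idle
iostat -xnez 10 6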
On Wed, 1 Feb 2012, Jan Hellevik wrote:

> The disk in question is c6t70d0 - it shows consistently higher %b and asvc_t
> than the other disks in the pool. The output is from a 'zfs receive' after about 3 hours.
> The two c5dx disks are the 'rpool' mirror, the others belong to the 'backup' pool.

Are all of the disks the same make and model? What type of chassis are the
disks mounted in? Is it possible that the environment that this disk
experiences is somehow different than the others (e.g. due to vibration)?

> Should I be worried? And what other commands can I use to investigate further?

It is difficult to say if you should be worried.

Be sure to do 'iostat -xe' to see if there are any accumulating errors
related to the disk.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
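[In the same spirit, a per-device error summary shows the same soft/hard/transport counters together with the drive's vendor, model and serial number. This is only a sketch; the device name is simply the suspect disk from the earlier output:]

# One-shot error and identity summary for the suspect disk
iostat -En c6t70d0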
Hi!

On Feb 1, 2012, at 7:43 PM, Bob Friesenhahn wrote:

> On Wed, 1 Feb 2012, Jan Hellevik wrote:
>> The disk in question is c6t70d0 - it shows consistently higher %b and asvc_t
>> than the other disks in the pool. The output is from a 'zfs receive' after about 3 hours.
>> The two c5dx disks are the 'rpool' mirror, the others belong to the 'backup' pool.
>
> Are all of the disks the same make and model? What type of chassis are the
> disks mounted in? Is it possible that the environment that this disk
> experiences is somehow different than the others (e.g. due to vibration)?

They are different makes - I try to make pairs of different brands to minimise risk.

The disks are in a Rackable Systems enclosure (disk shelf?). 16 disks, all SATA.
Connected to a SASUC8I controller on the server.

This is a backup server I recently put together to keep backups from my main server.
I put in the disks from the old 'backup' pool and have started a 2TB zfs send/receive
from my main server. So far things look ok; it is just the somewhat high values on
that one disk that worry me a little.

>> Should I be worried? And what other commands can I use to investigate further?
>
> It is difficult to say if you should be worried.
>
> Be sure to do 'iostat -xe' to see if there are any accumulating errors related to the disk.

This is the most current output from iostat. It has been running a zfs receive for
more than a day. No errors. zpool status also reports no errors.

                            extended device statistics       ---- errors ---
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    8.1   18.7  142.5   180.4  0.0  0.1    0.1    3.2   0   8   0   0   0   0 c5d0
   10.2   18.7  186.3   180.4  0.0  0.1    0.1    3.3   0   9   0   0   0   0 c5d1
    0.0   36.7    0.0  3595.8  0.0  0.1    0.0    3.2   0   9   0   0   0   0 c6t66d0
    0.0   36.0    0.0  3642.2  0.0  0.1    0.0    3.9   0  12   0   0   0   0 c6t70d0
    0.0   36.1    0.0  3642.2  0.0  0.1    0.0    2.9   0   5   0   0   0   0 c6t74d0
    0.0   39.6    0.0  4071.8  0.0  0.0    0.0    0.7   0   2   0   0   0   0 c6t76d0
    0.2    0.0    0.3     0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 c6t77d0
    0.2   36.8    0.3  3595.8  0.0  0.1    0.0    1.9   0   4   0   0   0   0 c6t78d0
    0.2    0.0    0.3     0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 c6t79d0
    0.2   39.6    0.3  4071.6  0.0  0.1    0.0    1.6   0   5   0   0   0   0 c6t80d0
    0.2    0.0    0.3     0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 c6t81d0

admin@master:/export/home/admin$ zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
backup  4.53T  2.17T  2.36T    47%  1.00x  ONLINE  -

admin@master:/export/home/admin$ zpool status
  pool: backup
 state: ONLINE
  scan: scrub repaired 0 in 5h7m with 0 errors on Tue Jan 31 04:55:31 2012
config:

        NAME         STATE     READ WRITE CKSUM
        backup       ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            c6t78d0  ONLINE       0     0     0
            c6t66d0  ONLINE       0     0     0
          mirror-1   ONLINE       0     0     0
            c6t70d0  ONLINE       0     0     0
            c6t74d0  ONLINE       0     0     0
          mirror-2   ONLINE       0     0     0
            c6t76d0  ONLINE       0     0     0
            c6t80d0  ONLINE       0     0     0

errors: No known data errors
Hi Jan,

These commands will tell you if FMA faults are logged:

# fmdump
# fmadm faulty

This command will tell you if errors are accumulating on this disk:

# fmdump -eV | more

Thanks,

Cindy

On 02/01/12 11:20, Jan Hellevik wrote:
> I suspect that something is wrong with one of my disks.
>
> This is the output from iostat:
>
>                             extended device statistics       ---- errors ---
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
>     2.0   18.9   38.1   160.9  0.0  0.1    0.1    3.2   0   6   0   0   0   0 c5d0
>     2.7   18.8   59.3   160.9  0.0  0.1    0.2    3.2   0   6   0   0   0   0 c5d1
>     0.0   36.8    1.1  3593.7  0.0  0.1    0.0    2.9   0   8   0   0   0   0 c6t66d0
>     0.0   38.2    0.0  3693.7  0.0  0.2    0.0    4.6   0  12   0   0   0   0 c6t70d0
>     0.0   38.1    0.0  3693.7  0.0  0.1    0.0    2.4   0   5   0   0   0   0 c6t74d0
>     0.0   42.0    0.0  4155.4  0.0  0.0    0.0    0.6   0   2   0   0   0   0 c6t76d0
>     0.0   36.9    0.0  3593.7  0.0  0.1    0.0    1.4   0   3   0   0   0   0 c6t78d0
>     0.0   41.7    0.0  4155.4  0.0  0.0    0.0    1.2   0   4   0   0   0   0 c6t80d0
>
> The disk in question is c6t70d0 - it shows consistently higher %b and asvc_t
> than the other disks in the pool. The output is from a 'zfs receive' after about 3 hours.
> The two c5dx disks are the 'rpool' mirror, the others belong to the 'backup' pool.
>
> admin@master:~# zpool status
>   pool: backup
>  state: ONLINE
>   scan: scrub repaired 0 in 5h7m with 0 errors on Tue Jan 31 04:55:31 2012
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         backup       ONLINE       0     0     0
>           mirror-0   ONLINE       0     0     0
>             c6t78d0  ONLINE       0     0     0
>             c6t66d0  ONLINE       0     0     0
>           mirror-1   ONLINE       0     0     0
>             c6t70d0  ONLINE       0     0     0
>             c6t74d0  ONLINE       0     0     0
>           mirror-2   ONLINE       0     0     0
>             c6t76d0  ONLINE       0     0     0
>             c6t80d0  ONLINE       0     0     0
>
> errors: No known data errors
>
> admin@master:~# zpool list
> NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
> backup  4.53T  1.37T  3.16T    30%  1.00x  ONLINE  -
>
> admin@master:~# uname -a
> SunOS master 5.11 oi_148 i86pc i386 i86pc
>
> Should I be worried? And what other commands can I use to investigate further?
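[Since fmdump -eV can be very long on a busy system, one rough way to watch whether new error reports accumulate during the receive is to compare counts over time. This is only a sketch built from the same FMA tools, nothing specific to this system is assumed:]

# One line per error report in the fault manager's error log;
# re-run later and compare the count
fmdump -e | wc -l

# Summary of diagnosed faults, including ones already repaired or acquitted
fmadm faulty -a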
On Wed, 1 Feb 2012, Jan Hellevik wrote:

>>
>> Are all of the disks the same make and model?
>
> They are different makes - I try to make pairs of different brands to minimise risk.

Does your pairing maintain the same pattern of disk type across all the
pairings?

Some modern disks use 4k sectors while others still use 512 bytes. If the
slow disk is a 4k sector model but the others are 512 byte models, then
that would certainly explain a difference.

Assuming that a couple of your disks are still unused, you could try
replacing the suspect drive with an unused drive (via zfs command) to see
if the slowness goes away. You could also make that vdev a triple-mirror
since it is very easy to add/remove drives from a mirror vdev. Just make
sure that your zfs syntax is correct so that you don't accidentally add a
single-drive vdev to the pool (oops!). These sorts of things can be tested
with zfs commands without physically moving/removing drives or endangering
your data.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
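[In concrete terms, the steps Bob describes would look roughly like the commands below; c6t82d0 is a made-up name standing in for whatever spare disk is available:]

# Grow mirror-1 into a three-way mirror by attaching the spare to an existing member
zpool attach backup c6t74d0 c6t82d0

# After the resilver completes, the suspect disk can be dropped again
zpool detach backup c6t70d0

# Or swap the suspect disk out in a single step
zpool replace backup c6t70d0 c6t82d0

# Careful: "zpool add" creates a new top-level vdev instead of attaching to
# an existing mirror -- that is the single-drive-vdev mistake Bob warns about.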
On Feb 1, 2012, at 8:07 PM, Bob Friesenhahn wrote:

> On Wed, 1 Feb 2012, Jan Hellevik wrote:
>>>
>>> Are all of the disks the same make and model?
>>
>> They are different makes - I try to make pairs of different brands to minimise risk.
>
> Does your pairing maintain the same pattern of disk type across all the pairings?

Not 100% sure I understand what you mean (English is not my first language).
These are the disks:

mirror-0: wd15ears + hd154ui
mirror-1: wd15ears + hd154ui
mirror-2: wd20ears + hd204ui

Two pairs of 1.5TB and one pair of 2.0TB. I would like to have pairs of the same
size, but these were the disks I had available, and since it is a backup pool I
do not think it matters that much. If the flooding hadn't tripled the price of
disks I would probably buy a few more, but not at the current price level. :-(

I am waiting for a replacement 1.5TB disk and will replace the 'bad' one as soon
as I get it.

> Some modern disks use 4k sectors while others still use 512 bytes. If the slow
> disk is a 4k sector model but the others are 512 byte models, then that would
> certainly explain a difference.

AVAILABLE DISK SELECTIONS:
       0. c5d0 <?????xH?????????????0?0"??? cyl 14590 alt 2 hd 255 sec 63>
       1. c5d1 <?????xH?????????????0?0"??? cyl 14590 alt 2 hd 255 sec 63>
       2. c6t66d0 <ATA-WDC WD15EARS-00Z-0A80-1.36TB>
       3. c6t67d0 <ATA-SAMSUNG HD501LJ-0-12-465.76GB>
       4. c6t68d0 <ATA-WDC WD6400AAKS-2-3B01-596.17GB>
       5. c6t69d0 <ATA-SAMSUNG HD501LJ-0-12-465.76GB>
       6. c6t70d0 <ATA-WDC WD15EARS-00Z-0A80-1.36TB>
       7. c6t71d0 <ATA-SAMSUNG HD501LJ-0-13-465.76GB>
       8. c6t72d0 <ATA -WDC WD6400AAKS--3B01 cyl 38909 alt 2 hd 255 sec 126>
       9. c6t73d0 <ATA-SAMSUNG HD501LJ-0-13-465.76GB>
      10. c6t74d0 <ATA-SAMSUNG HD154UI-1118-1.36TB>
      11. c6t75d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
      12. c6t76d0 <ATA-SAMSUNG HD204UI-0001-1.82TB>
      13. c6t77d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
      14. c6t78d0 <ATA-SAMSUNG HD154UI-1118-1.36TB>
      15. c6t79d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
      16. c6t80d0 <ATA-WDC WD20EARS-00M-AB51-1.82TB>
      17. c6t81d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>

mirror-0
       2. c6t66d0 <ATA-WDC WD15EARS-00Z-0A80-1.36TB>
      14. c6t78d0 <ATA-SAMSUNG HD154UI-1118-1.36TB>
mirror-1
       6. c6t70d0 <ATA-WDC WD15EARS-00Z-0A80-1.36TB>
      10. c6t74d0 <ATA-SAMSUNG HD154UI-1118-1.36TB>
mirror-2
      12. c6t76d0 <ATA-SAMSUNG HD204UI-0001-1.82TB>
      16. c6t80d0 <ATA-WDC WD20EARS-00M-AB51-1.82TB>

You can see that mirror-0 and mirror-1 have identical disk pairs.

BTW: Can someone explain why this:
       8. c6t72d0 <ATA -WDC WD6400AAKS--3B01 cyl 38909 alt 2 hd 255 sec 126>
is not shown the same way as this:
       4. c6t68d0 <ATA-WDC WD6400AAKS-2-3B01-596.17GB>

Why the cylinder/sector in line 8?

> Assuming that a couple of your disks are still unused, you could try replacing
> the suspect drive with an unused drive (via zfs command) to see if the slowness
> goes away. You could also make that vdev a triple-mirror since it is very easy
> to add/remove drives from a mirror vdev. Just make sure that your zfs syntax is
> correct so that you don't accidentally add a single-drive vdev to the pool
> (oops!). These sorts of things can be tested with zfs commands without
> physically moving/removing drives or endangering your data.

If I had available disks, I would. As of now, they are all busy. :-)
Thanks for the advice!

> Bob
> --
> Bob Friesenhahn
> bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
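[On the 4k-sector question, one way to see how the pool is actually treating sector size is to look at the ashift recorded for each top-level vdev (ashift=9 means 512-byte allocation units, ashift=12 means 4 KiB). A quick check, assuming zdb on oi_148 behaves like other contemporary builds:]

# Dump the cached pool configuration and pick out the per-vdev allocation shift
zdb -C backup | grep ashift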
Hello Jan,

> BTW: Can someone explain why this:
>        8. c6t72d0 <ATA -WDC WD6400AAKS--3B01 cyl 38909 alt 2 hd 255 sec 126>
> is not shown the same way as this:
>        4. c6t68d0 <ATA-WDC WD6400AAKS-2-3B01-596.17GB>
>
> Why the cylinder/sector in line 8?

As far as I know, this depends on the label on the disk: SMI or EFI.

What does prtvtoc show you?

S0013(root)#~> prtvtoc /dev/dsk/<diskname>s2
* /dev/dsk/<diskname>s2 partition map
*
* Dimensions:
*     512 bytes/sector
* 2097152 sectors
* 2097085 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector     Count    Sector
*          34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00        256   2080479   2080734
       8     11    00    2080735     16384   2097118    <<<< indicates EFI label

S0013(root)#~> prtvtoc /dev/dsk/<diskname>s2
* /dev/dsk/c1t0d0s2 (volume "ROOTDISK") partition map
*
* Dimensions:
*     512 bytes/sector
*     255 sectors/track
*      16 tracks/cylinder
*    4080 sectors/cylinder
*   38309 cylinders
*   38307 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00          0 156292560 156292559
       2      5    00          0 156292560 156292559    <<<<<< indicates SMI label

      19. c0t<diskname>d0 <SUN-SOLARIS-1-1.00GB>
      24. c1t<diskname>d0 <DEFAULT cyl 38307 alt 2 hd 16 sec 255>  ROOTDISK

Regards,
Christian
Hi!

You were right. It turns out that the disks were not part of a pool yet. One of
them had previously been used in a pool on another machine, but the other had
been used somewhere else (Ubuntu or OS X), and that explains it. After I put
them to use in a pool, 'format' shows what I expected:

       4. c6t68d0 <ATA-WDC WD6400AAKS-2-3B01-596.17GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@44,0
       8. c6t72d0 <ATA-WDC WD6400AAKS-2-3B01-596.17GB>
          /pci@0,0/pci1022,9603@2/pci1000,3140@0/sd@48,0

Thank you for the explanation!

On Feb 3, 2012, at 12:02 PM, Christian Meier wrote:

> Hello Jan,
>
> I'm not sure if you saw my answer, because I answered to the mailing list.
>
>> BTW: Can someone explain why this:
>>        8. c6t72d0 <ATA -WDC WD6400AAKS--3B01 cyl 38909 alt 2 hd 255 sec 126>
>> is not shown the same way as this:
>>        4. c6t68d0 <ATA-WDC WD6400AAKS-2-3B01-596.17GB>
>>
>> Why the cylinder/sector in line 8?
>
> As far as I know, this depends on the label on the disk: SMI or EFI.
>
> What does prtvtoc show you?
>
> S0013(root)#~> prtvtoc /dev/dsk/<diskname>s2
> * /dev/dsk/<diskname>s2 partition map
> *
> * Dimensions:
> *     512 bytes/sector
> * 2097152 sectors
> * 2097085 accessible sectors
> *
> * Flags:
> *   1: unmountable
> *  10: read-only
> *
> * Unallocated space:
> *       First     Sector    Last
> *       Sector     Count    Sector
> *          34       222       255
> *
> *                          First     Sector    Last
> * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
>        0      4    00        256   2080479   2080734
>        8     11    00    2080735     16384   2097118    <<<< indicates EFI label
>
> S0013(root)#~> prtvtoc /dev/dsk/<diskname>s2
> * /dev/dsk/c1t0d0s2 (volume "ROOTDISK") partition map
> *
> * Dimensions:
> *     512 bytes/sector
> *     255 sectors/track
> *      16 tracks/cylinder
> *    4080 sectors/cylinder
> *   38309 cylinders
> *   38307 accessible cylinders
> *
> * Flags:
> *   1: unmountable
> *  10: read-only
> *
> *                          First     Sector    Last
> * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
>        0      2    00          0 156292560 156292559
>        2      5    00          0 156292560 156292559    <<<<<< indicates SMI label
>
>       19. c0t<diskname>d0 <SUN-SOLARIS-1-1.00GB>
>       24. c1t<diskname>d0 <DEFAULT cyl 38307 alt 2 hd 16 sec 255>  ROOTDISK
>
> Regards,
> Christian