Koopmann, Jan-Peter
2012-Jun-17 15:11 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
Hi,

my oi151 based home NAS is approaching a frightening "drive space" level. Right now the data volume is a 4*1TB RAID-Z1, 3 1/2" local disks individually connected to an 8 port LSI 6Gbit controller.

So I can either exchange the disks one by one with autoexpand, use 2-4 TB disks and be happy. This was my original approach. However I am totally unclear about the 512b vs 4Kb issue. What SATA disk could I use that is big enough and still uses 512b? I know about the discussion about the upgrade from a 512b based pool to a 4KB pool, but I fail to see a conclusion. Will the autoexpand mechanism upgrade ashift? And what disks do not lie? Is the performance impact significant?

So I started to think about option 2. That would be using an external JBOD chassis (4-8 disks) and eSATA. But I would either need a JBOD with 4-8 eSATA connectors (which I have yet to find) or use a JBOD with a "good" expander. I see several cheap SATA to eSATA JBOD chassis making use of a "port multiplier". Is this referring to an expander backplane, and will it work with oi, LSI and mpt or mpt_sas?

I am aware that this is not the most performant solution, but this is a home NAS storing tons of pictures and videos only. And I could use the internal disks for backup purposes.

Any suggestions for components are greatly appreciated. And before you ask: currently I have 3TB net. 6TB net would be the minimum target; 9TB sounds nicer. So if you have 512b HD recommendations with 2/3TB each or a good JBOD suggestion, please let me know!

Kind regards,
   JP
Timothy Coalson
2012-Jun-17 20:19 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
> So I can either exchange the disks one by one with autoexpand, use 2-4 TB
> disks and be happy. This was my original approach. However I am totally
> unclear about the 512b vs 4Kb issue. What SATA disk could I use that is big
> enough and still uses 512b? I know about the discussion about the upgrade
> from a 512b based pool to a 4KB pool, but I fail to see a conclusion. Will
> the autoexpand mechanism upgrade ashift? And what disks do not lie? Is the
> performance impact significant?

Replacing devices will not change the ashift; it is set permanently when a vdev is created, and zpool will refuse to replace a device in an ashift=9 vdev with a device that it would use ashift=12 on. Large Western Digital disks tend to say they have 4k sectors, and hence cannot be used to replace your current disks, while Hitachi and Seagate offer 512 emulated disks, which should allow you to replace your current disks without needing to copy the contents of the pool to a new one.

If you don't have serious performance requirements, you may not notice the impact of emulated 512 sectors (especially since zfs buffers async writes into transaction groups). I did some rudimentary testing on a large pool of Hitachi 3TB 512 emulated disks with ashift=9 vs ashift=12 with bonnie, and it didn't seem to matter a whole lot (though it is possibly relevant that the tests were large writes, which have little penalty, and character-at-a-time, which was bottlenecked by the CPU since the test was single threaded, so it didn't test the worst case). The worst case for 512 emulated sectors on zfs is probably small (4KB or so) synchronous writes (which, if they mattered to you, you would probably have a separate log device, in which case the data disk write penalty may not matter).

> So I started to think about option 2. That would be using an external JBOD
> chassis (4-8 disks) and eSATA. But I would either need a JBOD with 4-8 eSATA
> connectors (which I have yet to find) or use a JBOD with a "good" expander. I
> see several cheap SATA to eSATA JBOD chassis making use of a "port
> multiplier". Is this referring to an expander backplane, and will it work
> with oi, LSI and mpt or mpt_sas?

I'm wondering, based on the comment about routing 4 eSATA cables, what kind of options your NAS case has. If your LSI controller has SFF-8087 connectors (or possibly even if it doesn't), you might be able to use an adapter to the SFF-8088 external 4-lane SAS connector, which may increase your options. It seems that support for SATA port multipliers is not mandatory in a controller, so you will want to check with LSI before trying it (I would hope they support it on SAS controllers, since I think it is a vastly simplified version of SAS expanders).

Tim
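P.S. For illustration, checking what ashift an existing vdev got, and the disk-by-disk swap with autoexpand, would look roughly like this (pool and device names are just placeholders):

  # dump the cached pool config and look for the vdev's ashift
  zdb -C tank | grep ashift
  #             ashift: 9

  # one-by-one replacement, assuming each new disk is accepted at that ashift
  zpool set autoexpand=on tank
  zpool replace tank c2t0d0 c2t4d0    # repeat per disk, letting each
  zpool status tank                   # resilver finish before the next swap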
2012-06-17 19:11, Koopmann, Jan-Peter wrote:
> Hi,
>
> my oi151 based home NAS is approaching a frightening "drive space"
> level. Right now the data volume is a 4*1TB RAID-Z1, 3 1/2" local disks
> individually connected to an 8 port LSI 6Gbit controller.
>
> So I can either exchange the disks one by one with autoexpand, use 2-4
> TB disks and be happy. This was my original approach. However I am
> totally unclear about the 512b vs 4Kb issue. What SATA disk could I use
> that is big enough and still uses 512b? I know about the discussion
> about the upgrade from a 512b based pool to a 4KB pool, but I fail to
> see a conclusion. Will the autoexpand mechanism upgrade ashift? And what
> disks do not lie? Is the performance impact significant?

AFAIK the Hitachi Desk/Ultra-Star (5K3000, 7K3000) should be 512b native, maybe the only ones at this size. The larger 4TB Hitachi models are 4KB physical, 512e emulated - according to the datasheets on their site.

HTH,
//Jim Klimov
Daniel Carosone
2012-Jun-17 21:41 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Sun, Jun 17, 2012 at 03:19:18PM -0500, Timothy Coalson wrote:
> Replacing devices will not change the ashift; it is set permanently
> when a vdev is created, and zpool will refuse to replace a device in
> an ashift=9 vdev with a device that it would use ashift=12 on.

Yep.

> [..] while Hitachi and Seagate offer 512 emulated disks

> I did some rudimentary testing on a large pool of Hitachi 3TB 512 emulated
> disks with ashift=9 vs ashift=12 with bonnie, and it didn't seem to matter
> a whole lot

Hitachi are native 512-byte sectors. At least, the 5k3000 and 7k3000 are, in the 2T and 3T sizes. I haven't noticed if they have a newer model which is 4k native.

How long that continues to remain the case, and how long these models remain available (e.g. for replacements), is entirely another matter. That applies even to under-warranty cases; I know someone who recently had a 4k-only drive supplied as a warranty replacement for a 512 native drive (not, in this case, from Hitachi).

As for performance, at least in my experience with WD disks emulating 512-byte sectors, you *will* notice the difference; heavy metadata updates being the most obvious impact.

The conclusion is that unless your environment is well controlled, the time has probably come where new general-purpose pools should be made at ashift=12, to allow future flexibility.

> I'm wondering, based on the comment about routing 4 eSATA cables, what
> kind of options your NAS case has. If your LSI controller has SFF-8087
> connectors (or possibly even if it doesn't), you might be able to use
> an adapter to the SFF-8088 external 4-lane SAS connector, which may
> increase your options. It seems that support for SATA port multipliers
> is not mandatory in a controller, so you will want to check with LSI
> before trying it (I would hope they support it on SAS controllers,
> since I think it is a vastly simplified version of SAS expanders).

SATA port multipliers and SAS expanders are not related in any sense of common driver support; they're similar only in general concept. Do not conflate them.

--
Dan.
Koopmann, Jan-Peter
2012-Jun-17 22:21 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
Hi Tim,

thanks to you and the others for answering.

> worst case). The worst case for 512 emulated sectors on zfs is
> probably small (4KB or so) synchronous writes (which, if they mattered
> to you, you would probably have a separate log device, in which case
> the data disk write penalty may not matter).

Good to know. This really opens up the possibility of buying 3 or 4TB Hitachi drives. At least the 4TB Hitachi drives are 4k (512b emulated) drives according to the latest news.

> I'm wondering, based on the comment about routing 4 eSATA cables, what
> kind of options your NAS case has. If your LSI controller has SFF-8087
> connectors (or possibly even if it doesn't),

It has, actually.

> you might be able to use
> an adapter to the SFF-8088 external 4-lane SAS connector, which may
> increase your options.

So what you are saying is that something like this will do the trick?

http://www.pc-pitstop.com/sata_enclosures/scsat44xb.asp

If I interpret this correctly, I get an SFF-8087 to SFF-8088 bracket, connect the 4-port LSI SFF-8087 to that bracket, then get a cable for this JBOD and throw in 4 drives? This would leave me with four additional HDDs without any SAS expander hassle. I had not come across these JBODs. Thanks a million for the hint.

Do we agree that for a home NAS box a Hitachi Deskstar (not explicitly being a server SATA drive) will suffice despite potential TLER problems? I was thinking about Hitachi Deskstar 5k3000 drives. The 4TB drives seemingly just came out but are rather expensive in comparison.

Kind regards,
   JP
Carson Gaspar
2012-Jun-17 23:57 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On 6/17/12 3:21 PM, Koopmann, Jan-Peter wrote:
> Hi Tim,
>
>> you might be able to use
>> an adapter to the SFF-8088 external 4-lane SAS connector, which may
>> increase your options.
>
> So what you are saying is that something like this will do the trick?
>
> http://www.pc-pitstop.com/sata_enclosures/scsat44xb.asp
>
> If I interpret this correctly, I get an SFF-8087 to SFF-8088 bracket,
> connect the 4-port LSI SFF-8087 to that bracket, then get a cable for
> this JBOD and throw in 4 drives? This would leave me with four
> additional HDDs without any SAS expander hassle. I had not come across
> these JBODs. Thanks a million for the hint.

I have 2 Sans Digital TR8X JBOD enclosures, and they work very well. They also make a 4-bay TR4X.

http://www.sansdigital.com/towerraid/tr4xb.html
http://www.sansdigital.com/towerraid/tr8xb.html

They cost a bit more than the one you linked to, but the drives are hot swap. They also make similar cases with port multipliers, RAID, etc., but I've only used the JBOD.

--
Carson
Timothy Coalson
2012-Jun-18 01:36 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
>> worst case). The worst case for 512 emulated sectors on zfs is
>> probably small (4KB or so) synchronous writes (which, if they mattered
>> to you, you would probably have a separate log device, in which case
>> the data disk write penalty may not matter).
>
> Good to know. This really opens up the possibility of buying 3 or 4TB
> Hitachi drives. At least the 4TB Hitachi drives are 4k (512b emulated)
> drives according to the latest news.

It appears from the specs listed on the Hitachi site that the drives I have may actually be 512 native, in which case my testing was moot. This does explain some other things I saw testing the drives in question, so I will assume they are 512 native, and that my testing was meaningless. If you copy folders containing thousands of small files frequently, the performance impact may be relevant, if you go for the 512 emulated drives.

> So what you are saying is that something like this will do the trick?
>
> http://www.pc-pitstop.com/sata_enclosures/scsat44xb.asp
>
> If I interpret this correctly, I get an SFF-8087 to SFF-8088 bracket,
> connect the 4-port LSI SFF-8087 to that bracket, then get a cable for this
> JBOD and throw in 4 drives? This would leave me with four additional HDDs
> without any SAS expander hassle. I had not come across these JBODs. Thanks
> a million for the hint.

No problem, and yes, I think that should work. One thing to keep in mind, though, is that if the internals of the enclosure simply split the multilane SAS cable into 4 connectors without an expander, and you use SATA drives, the controller will use SATA mode, which as I understand it runs at a lower signalling voltage and won't work over long cables, so get a short cable (1 meter, shorter if you can find one). It looks like all of the ones mentioned so far use this method, though it would be good to know if Carson populated his with SATA drives.

> Do we agree that for a home NAS box a Hitachi Deskstar (not explicitly being
> a server SATA drive) will suffice despite potential TLER problems? I was
> thinking about Hitachi Deskstar 5k3000 drives. The 4TB drives seemingly just
> came out but are rather expensive in comparison.

I'm not sure what ZFS's timeout for dropping an unresponsive disk is, or what it does when it responds again, so I don't know if TLER would help. I have not had any serious problems with my pool of Hitachi 3TB 5400 drives. Two different drives had a checksum error, once each, but stayed online in the pool.

Tim
Carson Gaspar
2012-Jun-18 01:55 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On 6/17/12 6:36 PM, Timothy Coalson wrote:
> No problem, and yes, I think that should work. One thing to keep in
> mind, though, is that if the internals of the enclosure simply split
> the multilane SAS cable into 4 connectors without an expander, and you
> use SATA drives, the controller will use SATA mode, which as I
> understand it runs at a lower signalling voltage and won't work over
> long cables, so get a short cable (1 meter, shorter if you can find
> one). It looks like all of the ones mentioned so far use this method,
> though it would be good to know if Carson populated his with SATA
> drives.

SATA drives using 1m cables from an LSI SAS9201-16e.

--
Carson
Koopmann, Jan-Peter
2012-Jun-18 07:19 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
Hi Carson,

> I have 2 Sans Digital TR8X JBOD enclosures, and they work very well.
> They also make a 4-bay TR4X.
>
> http://www.sansdigital.com/towerraid/tr4xb.html
> http://www.sansdigital.com/towerraid/tr8xb.html

looks nice! The only thing coming to mind is that according to the specifications the enclosure is 3Gbit "only". If I choose to put in an SSD with 6Gbit this would not be optimal. I looked at their site but failed to find 6Gbit enclosures. But I will keep looking, since sooner or later they will provide one.

I think I will go for the option of replacing the four drives for now with the Hitachi 3TB drives. This will give me 9TB net at RAID-Z1 level. I will calculate how expensive an 8-bay enclosure with an LSI 8-port external controller would be, just in case the 9TB is not sufficient, I need a backup place, or I decide to go for RAID-Z2. :-)

> They cost a bit more than the one you linked to, but the drives are hot
> swap. They also make similar cases with port multipliers, RAID, etc.,
> but I've only used the JBOD.

I will bookmark them. The enclosures do look nice.

Kind regards,
   JP
Fajar A. Nugraha
2012-Jun-18 07:41 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Mon, Jun 18, 2012 at 2:19 PM, Koopmann, Jan-Peter <jan-peter at koopmann.eu> wrote:
> Hi Carson,
>
>> I have 2 Sans Digital TR8X JBOD enclosures, and they work very well.
>> They also make a 4-bay TR4X.
>>
>> http://www.sansdigital.com/towerraid/tr4xb.html
>> http://www.sansdigital.com/towerraid/tr8xb.html
>
> looks nice! The only thing coming to mind is that according to the
> specifications the enclosure is 3Gbit "only".

You mean http://www.sansdigital.com/towerraid-plus/index.php ?

--
Fajar
Bob Friesenhahn
2012-Jun-18 13:39 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Mon, 18 Jun 2012, Koopmann, Jan-Peter wrote:
>
> looks nice! The only thing coming to mind is that according to the
> specifications the enclosure is 3Gbit "only". If I choose to put in an
> SSD with 6Gbit this would not be optimal. I looked at their site but
> failed to find 6Gbit enclosures. But I will keep looking, since sooner
> or later they will provide one.

I browsed the site and saw many 6Gbit enclosures. I also saw one with Nexenta (Solaris/zfs appliance) inside.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Koopmann, Jan-Peter
2012-Jun-18 14:32 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
Hi Bob,

> On Mon, 18 Jun 2012, Koopmann, Jan-Peter wrote:
>>
>> looks nice! The only thing coming to mind is that according to the
>> specifications the enclosure is 3Gbit "only". If I choose to put in an
>> SSD with 6Gbit this would not be optimal. I looked at their site but
>> failed to find 6Gbit enclosures. But I will keep looking, since sooner
>> or later they will provide one.
>
> I browsed the site and saw many 6Gbit enclosures. I also saw one with
> Nexenta (Solaris/zfs appliance) inside.

I found several high-end enclosures, or ones with bundled RAID cards. But the equivalent of the one originally suggested I was not able to find. However, after looking at tons of sites for hours I might simply have missed it. If you found one, can you please forward a link?

Kind regards,
   JP
Bob Friesenhahn
2012-Jun-18 14:56 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Mon, 18 Jun 2012, Koopmann, Jan-Peter wrote:
>> I browsed the site and saw many 6Gbit enclosures. I also saw one with
>> Nexenta (Solaris/zfs appliance) inside.
>
> I found several high-end enclosures, or ones with bundled RAID cards. But
> the equivalent of the one originally suggested I was not able to find.
> However, after looking at tons of sites for hours I might simply have
> missed it. If you found one, can you please forward a link?

So you want high-end performance at a low-end price? It seems unlikely that you will notice the difference between 3Gbit and 6Gbit for a "home" application.

FLASH-based SSDs seem to burn out pretty quickly if you don't use them carefully. The situation is getting worse rather than better over time as FLASH geometries get smaller and they try to store more bits in one cell. What was described as a bright new future is starting to look more like an end of the road to me.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Carson Gaspar
2012-Jun-18 22:59 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On 6/18/12 12:19 AM, Koopmann, Jan-Peter wrote:
> Hi Carson,
>
>> I have 2 Sans Digital TR8X JBOD enclosures, and they work very well.
>> They also make a 4-bay TR4X.
>>
>> http://www.sansdigital.com/towerraid/tr4xb.html
>> http://www.sansdigital.com/towerraid/tr8xb.html
>
> looks nice! The only thing coming to mind is that according to the
> specifications the enclosure is 3Gbit "only". If I choose to put in an
> SSD with 6Gbit this would not be optimal. I looked at their site but
> failed to find 6Gbit enclosures. But I will keep looking, since sooner
> or later they will provide one.

The JBOD enclosures are completely passive. I can't imagine any reason they wouldn't support 6Gbit SATA/SAS - there are no electronics in them, just wire routing.

--
Carson
Koopmann, Jan-Peter
2012-Jun-18 23:07 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
Thanks. Just noticed that the Hitachi 3TB drives are not available. The 4TB ones are, but with 512b emulation only. However I can get Barracuda 7200.14 drives with supposedly real 4k quite cheap. Anyone any experience with those? I might be getting one or two more and go for Z2 instead of Z1. I even found affordable passive enclosures available in Germany for very little money...

The overall plan then would really be to switch to external JBODs and use the existing drives for backup only.

Kind regards,
   JP

On 19.06.2012 at 01:02, "Carson Gaspar" <carson at taltos.org> wrote:

> On 6/18/12 12:19 AM, Koopmann, Jan-Peter wrote:
>> Hi Carson,
>>
>>> I have 2 Sans Digital TR8X JBOD enclosures, and they work very well.
>>> They also make a 4-bay TR4X.
>>>
>>> http://www.sansdigital.com/towerraid/tr4xb.html
>>> http://www.sansdigital.com/towerraid/tr8xb.html
>>
>> looks nice! The only thing coming to mind is that according to the
>> specifications the enclosure is 3Gbit "only". If I choose to put in an
>> SSD with 6Gbit this would not be optimal. I looked at their site but
>> failed to find 6Gbit enclosures. But I will keep looking, since sooner
>> or later they will provide one.
>
> The JBOD enclosures are completely passive. I can't imagine any reason
> they wouldn't support 6Gbit SATA/SAS - there are no electronics in them,
> just wire routing.
>
> --
> Carson
Carson Gaspar
2012-Jun-18 23:26 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On 6/18/12 4:07 PM, Koopmann, Jan-Peter wrote:
> Thanks. Just noticed that the Hitachi 3TB drives are not available. The
> 4TB ones are, but with 512b emulation only. However I can get Barracuda
> 7200.14 drives with supposedly real 4k quite cheap. Anyone any experience
> with those? I might be getting one or two more and go for Z2 instead of Z1.

What makes you think the Barracuda 7200.14 drives report 4k sectors? I gave up looking for 4kn drives, as everything I could find was 512e. I would _love_ to be wrong, as I have 8 4TB Hitachis on backorder that I would gladly replace with 4kn drives, even if I had to drop to 3TB density.

From page 11 of
http://www.seagate.com/staticfiles/support/docs/manual/desktop/Barracuda%207200.14/100686584c.pdf

Formatted capacity (512 bytes/sector)**
Bytes per sector (4K physical emulated at 512-byte sectors)

From the smartmontools list,
http://www.mail-archive.com/smartmontools-database at lists.sourceforge.net/msg00537.html

Sector Sizes:     512 bytes logical, 4096 bytes physical

--
Carson
Timothy Coalson
2012-Jun-19 00:24 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
> What makes you think the Barracuda 7200.14 drives report 4k sectors? I gave
> up looking for 4kn drives, as everything I could find was 512e. I would
> _love_ to be wrong, as I have 8 4TB Hitachis on backorder that I would
> gladly replace with 4kn drives, even if I had to drop to 3TB density.

I have a Western Digital drive with 4k physical sectors that causes OpenIndiana to use ashift=12 (and refuse to use it as a replacement in an ashift=9 vdev), though it appears to report 512 logical sectors, so somehow OpenIndiana noticed the real sector size. It is model WD30EURS (a green "AV-GP" drive). Is it close enough to what you want for Solaris to know that it is really 4k, even if it is a 512e drive?

Some smartctl info:

Device Model:     WDC WD30EURS-63R8UY0
...
Firmware Version: 80.00A80
...
Sector Sizes:     512 bytes logical, 4096 bytes physical

hdparm -I (on Linux) also figures out that the drive has 4k physical sectors.

Tim
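P.S. In case it is useful, this is roughly how I queried the two sector sizes (on Linux here, since that is where I ran hdparm; the device name is just an example):

  # smartmontools prints both values as the drive reports them
  smartctl -i /dev/sdb | grep -i 'sector size'
  #   Sector Sizes:     512 bytes logical, 4096 bytes physical

  # hdparm -I reads the same fields from the ATA IDENTIFY data
  hdparm -I /dev/sdb | grep -i 'sector size'
  #   Logical  Sector size:                   512 bytes
  #   Physical Sector size:                  4096 bytes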
On Mon, Jun 18, 2012 at 5:26 PM, Carson Gaspar <carson at taltos.org> wrote:
> What makes you think the Barracuda 7200.14 drives report 4k sectors? I gave
> up looking for 4kn drives, as everything I could find was 512e. I would
> _love_ to be wrong, as I have 8 4TB Hitachis on backorder that I would
> gladly replace with 4kn drives, even if I had to drop to 3TB density.
>
> From page 11 of
> http://www.seagate.com/staticfiles/support/docs/manual/desktop/Barracuda%207200.14/100686584c.pdf
>
> Formatted capacity (512 bytes/sector)**
> Bytes per sector (4K physical emulated at 512-byte sectors)
>
> From the smartmontools list,
> http://www.mail-archive.com/smartmontools-database at lists.sourceforge.net/msg00537.html
>
> Sector Sizes:     512 bytes logical, 4096 bytes physical

Hmm, interesting. Like Timothy noted, my $work is also running into the same problem with the Seagate 7200.14, where an ashift=9 pool would not accept the 4k drive. If you check the archives for the "Advanced Format HDD's - are we there yet?" [1] thread from the end of May, we discuss this same topic.

[1] http://mail.opensolaris.org/pipermail/zfs-discuss/2012-May/051559.html
Bob Friesenhahn
2012-Jun-19 01:40 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Mon, 18 Jun 2012, Carson Gaspar wrote:
>
> What makes you think the Barracuda 7200.14 drives report 4k sectors? I gave
> up looking for 4kn drives, as everything I could find was 512e. I would
> _love_ to be wrong, as I have 8 4TB Hitachis on backorder that I would
> gladly replace with 4kn drives, even if I had to drop to 3TB density.

Why would you want native 4k drives right now? Not much would work with such drives. Maybe in a dedicated chassis (e.g. the JBOD) they could be of some use.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Koopmann, Jan-Peter
2012-Jun-19 05:21 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
> What makes you think the Barracuda 7200.14 drives report 4k sectors?

http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg48912.html

Nigel stated this here a few days ago. I did not check for myself. Maybe Nigel can comment on this?

As for the question "why do you want 4k drives": my thinking is

- I will buy 4-6 disks now.
- I must assume that during the next 3-4 years one of them might fail.
- Will I be able to buy a replacement in 3-4 years that reports the disk in such a way that resilvering will work? According to the "Advanced Format" thread this seems to be a problem. I was hoping to get around this with these disks and have a more future-proof solution.

Moreover:

- If I buy new disks and a new JBOD etc., I might as well get a performant solution. In other threads ashift 9 vs 12 is presented as a problem.
- Disk alignment: I am currently using whole disks AFAIK. But I do not remember. Did I use slicing etc.? Is my alignment correct (btw. how do I check?)? So I thought: if I start over with a new pool I might get it right, and this seemed easier with those disks...

Might be totally wrong with my assumptions, and if so: hey, that's the reason for asking you, knowing I am not the expert myself. :-)

Kind regards,
   JP
Timothy Coalson
2012-Jun-19 20:32 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
> - Will I be able to buy a replacement in 3-4 years that reports the disk in
> such a way that resilvering will work? According to the "Advanced Format"
> thread this seems to be a problem. I was hoping to get around this with
> these disks and have a more future-proof solution.

I think that if you are running an illumos kernel, you can use /kernel/drv/sd.conf and tell it that the physical sectors for a disk model are 4k, despite what the disk says (and whether they really are). So, if you want an ashift=12 pool on disks that report 512 sectors, you should be able to do it now without a patched version of zpool.

> - Disk alignment: I am currently using whole disks AFAIK. But I do not
> remember. Did I use slicing etc.? Is my alignment correct (btw. how do I
> check?)? So I thought: if I start over with a new pool I might get it
> right, and this seemed easier with those disks...

The whole-disk method is generally recommended, and should align if it gets the sector size right; the only time I have manually sliced was to overprovision an SSD. I think prtvtoc is all you need to determine if it is aligned: if the bytes/sector value under Dimensions is the true (physical) sector size, it should be aligned (if it reports 512 when it has 4k sectors, then in theory if "First Sector" is a multiple of 8 it is aligned, but it will probably issue writes of size 512 which will degrade performance anyway).

Tim
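P.S. A rough sketch of what I mean, with a placeholder product string and device names - the exact tunable name (physical-block-size here) should be double-checked against your sd(7D) man page, since not every release has it:

  # /kernel/drv/sd.conf -- claim a 4k physical block size for a given model
  # (vendor ID padded to 8 characters, then the product ID the drive reports;
  # substitute your own drive's strings)
  sd-config-list = "ATA     ST3000DM001-9YN1", "physical-block-size:4096";

  # after a reboot (or update_drv -f sd), a pool newly created on that disk
  # should come up with ashift=12; verify with zdb:
  zdb -C tank | grep ashift

  # alignment/sector-size check on an existing whole-disk pool member:
  prtvtoc /dev/rdsk/c2t0d0
  # look at bytes/sector under Dimensions, and check that First Sector of
  # the data slice is a multiple of 8 if the disk is really 4k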
Koopmann, Jan-Peter
2012-Jun-20 04:20 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
Hi Timothy,

> I think that if you are running an illumos kernel, you can use
> /kernel/drv/sd.conf and tell it that the physical sectors for a disk
> model are 4k, despite what the disk says (and whether they really
> are). So, if you want an ashift=12 pool on disks that report 512
> sectors, you should be able to do it now without a patched version of
> zpool.

That refers to creating a new pool and is good to know. However I was more afraid about the comments in the "Advanced Format" thread stating that if you have an ashift=9 512b based pool and need to replace a drive, the resilver might fail if you put in a 4K disk. Assuming that in 2-3 years you might not be able to get 512b disks in the size you need anymore, this could be a serious problem.

> I think prtvtoc is all you need to determine
> if it is aligned: if the bytes/sector value under Dimensions is the
> true (physical) sector size, it should be aligned (if it reports 512
> when it has 4k sectors, then in theory if "First Sector" is a multiple
> of 8 it is aligned, but it will probably issue writes of size 512
> which will degrade performance anyway).

Thanks. I will note that down and check once the drives arrive.

Kind regards,
   JP
Timothy Coalson
2012-Jun-20 04:46 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
>> I think that if you are running an illumos kernel, you can use
>> /kernel/drv/sd.conf
>
> That refers to creating a new pool and is good to know.

Two things. One, it looks like you should also be able to trick it into using 512 sectors on a 4k disk, allowing you to do exactly such a replacement (incurring a similar write penalty to undetected 512 emulated sectors, but at least your pool won't be degraded, though regular scrubs may be more important). Two, a caveat: the current OpenIndiana oi_151a4 release doesn't seem to have a new enough version of illumos to support this yet; at least the man page for sd doesn't mention the needed tunable.

Tim
2012-06-20 0:32, Timothy Coalson wrote:
> when it has 4k sectors, then in theory if "First Sector" is a multiple
> of 8 it is aligned, but it will probably issue writes of size 512
> which will degrade performance anyway).

I think this is dependent on the firmware (vendor), and queued writes into the same HW sector (4Kb) which came in as a series of 512 byte blocks should be recombined in the HDD cache and written in one stroke. Thus it should not hurt performance - if write caching is enabled, at least, and/or NCQ/TCQ is supported.

That has its caveats - starting with the bug-lessness of the TCQ/NCQ/RMW and caching implementations, and on into power-failure "support" regarding caches (i.e. have a UPS or capacitors big enough to flush the disk's caches safely before parking the heads).

Also, by default, if you don't give the whole drive to ZFS, its cache may be disabled upon pool import and you may have to re-enable it manually (if you only actively use this disk for one or more ZFS pools - which play with caching nicely).

HTH,
//Jim Klimov
Richard Elling
2012-Jun-20 21:58 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Jun 20, 2012, at 4:08 AM, Jim Klimov wrote:
>
> Also, by default, if you don't give the whole drive to ZFS, its cache
> may be disabled upon pool import and you may have to re-enable it
> manually (if you only actively use this disk for one or more ZFS
> pools - which play with caching nicely).

This is not correct. The behaviour is to attempt to enable the disk's write cache if ZFS has the whole disk. Relevant code:
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/vdev_disk.c#319

Please help us to stop propagating the misinformation that ZFS disables write caches.
 -- richard

--
ZFS and performance consulting
http://www.RichardElling.com
2012-06-21 1:58, Richard Elling wrote:
> On Jun 20, 2012, at 4:08 AM, Jim Klimov wrote:
>>
>> Also, by default, if you don't give the whole drive to ZFS, its cache
>> may be disabled upon pool import and you may have to re-enable it
>
> The behaviour is to attempt to enable the disk's write cache if ZFS has
> the whole disk. Relevant code:
> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/vdev_disk.c#319
>
> Please help us to stop propagating the misinformation that ZFS disables
> write caches.
> -- richard

I see, sorry. So, the possible states are:

1) Before pool import, the disk cache was disabled; then the pool is imported:
   1a) If ZFS has the whole disk (how is that defined BTW, since partitions
       and slices are really used? Is the presence of a slice #7 which is
       16384 sectors long the trigger?) - then the cache is enabled;
   1b) ZFS does not have the whole disk - the cache is neither enabled nor
       disabled;

2) Before import the disk cache was enabled; after import: no change
   regardless of "whole-diskness".

Is this correct?

How does a disk become "cache disabled" then - only manually? Or due to UFS usage? Or does it inherit a HW setting? Or somehow else? I think the cache is enabled in the OS by default...

Thanks,
//Jim Klimov
Richard Elling
2012-Jun-21 01:39 UTC
[zfs-discuss] Recommendation for home NAS external JBOD
On Jun 20, 2012, at 5:08 PM, Jim Klimov wrote:
> 2012-06-21 1:58, Richard Elling wrote:
>> On Jun 20, 2012, at 4:08 AM, Jim Klimov wrote:
>>>
>>> Also, by default, if you don't give the whole drive to ZFS, its cache
>>> may be disabled upon pool import and you may have to re-enable it
>>
>> The behaviour is to attempt to enable the disk's write cache if ZFS has
>> the whole disk. Relevant code:
>> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/vdev_disk.c#319
>>
>> Please help us to stop propagating the misinformation that ZFS disables
>> write caches.
>> -- richard
>
> I see, sorry. So, the possible states are:
>
> 1) Before pool import, the disk cache was disabled; then the pool is imported:
>    1a) If ZFS has the whole disk (how is that defined BTW, since partitions
>        and slices are really used? Is the presence of a slice #7 which is
>        16384 sectors long the trigger?) - then the cache is enabled;

By the command used:

zpool create c0t0d0   ==> whole disk
zpool create c0t0d0s0 ==> not whole disk

>    1b) ZFS does not have the whole disk - the cache is neither enabled nor
>        disabled;
>
> 2) Before import the disk cache was enabled; after import: no change
>    regardless of "whole-diskness".

Correct.

> Is this correct?
>
> How does a disk become "cache disabled" then - only manually?
> Or due to UFS usage? Or does it inherit a HW setting? Or somehow else?

For Sun, it was done by setting the disk firmware.

> I think the cache is enabled in the OS by default...

In general, illumos does not touch the cache. I don't know of a way to set the cache policy in most BIOSes. In some cases, you can set it using format(1m), but whether it remains set after power-off depends on the drive manufacturer.

Bottom line: don't worry about it.
 -- richard

--
ZFS and performance consulting
http://www.RichardElling.com
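P.S. For anyone who wants to check or change a drive's write cache by hand, the expert-mode cache menu in format(1M) looks roughly like this (illustrative transcript; the exact menu wording varies by release and driver):

  format -e
    (select the disk, e.g. c2t0d0)
    format> cache
    cache> write_cache
    write_cache> display     # show whether the write cache is currently enabled
    write_cache> enable      # turn it on; whether it survives a power cycle
                             # depends on the drive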