Hi guys,

I'm currently running 2 zpools, each in a raidz1 configuration, totalling around 16TB of usable data. I'm running it all on an OpenSolaris-based box with 2GB of memory and an old Athlon 64 3700 CPU. I understand this is very poor and underpowered for deduplication, so I'm looking at building a new system, but wanted some advice first. Here is what I've planned so far:

Core i7 2600 CPU
16GB DDR3 memory
64GB SSD for ZIL (optional)

Would this produce decent results for deduplication of 16TB worth of pools, or would I need more RAM still?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Michael
>
> Core i7 2600 CPU
> 16GB DDR3 memory
> 64GB SSD for ZIL (optional)
>
> Would this produce decent results for deduplication of 16TB worth of pools
> or would I need more RAM still?

What matters is the amount of unique data in your pool. I'll just assume it's all unique, but of course that's ridiculous, because if it's all unique then why would you want to enable dedup? But anyway, I'm assuming 16T of unique data.

The rule is a little less than 3G of RAM for every 1T of unique data. In your case, 16 * 2.8 = 44.8G of RAM required in addition to your base RAM configuration. You need at least 48G of RAM. Or less unique data.
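Ned's ~3G-per-1T rule can be sanity-checked from first principles. A minimal sketch, assuming roughly 320 bytes of core memory per DDT entry and the default 128 KiB recordsize as the average block size (both figures are commonly quoted rules of thumb, not measured values for any particular pool):

```python
# Rough in-core DDT size estimate for ZFS dedup.
# BYTES_PER_DDT_ENTRY and AVG_BLOCK_SIZE are assumptions; pools whose
# average block is smaller than 128 KiB need proportionally more RAM.

BYTES_PER_DDT_ENTRY = 320        # assumed in-core size of one dedup-table entry
AVG_BLOCK_SIZE = 128 * 1024      # assumed average block size (default recordsize)

def ddt_ram_bytes(unique_data_bytes):
    """Estimate RAM needed to hold the whole dedup table in core."""
    entries = unique_data_bytes // AVG_BLOCK_SIZE
    return entries * BYTES_PER_DDT_ENTRY

TIB = 2 ** 40
GIB = 2 ** 30

need = ddt_ram_bytes(16 * TIB)
print(f"16 TiB of unique data -> ~{need / GIB:.0f} GiB of RAM for the DDT")
# -> ~40 GiB, in the same ballpark as the 44.8G figure above
```

The result lands close to the 16 * 2.8 = 44.8G number; the gap is down to the assumed per-entry size and the optimistic 128 KiB average block.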
On 2/7/2011 1:06 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Michael
>>
>> Core i7 2600 CPU
>> 16GB DDR3 memory
>> 64GB SSD for ZIL (optional)
>>
>> Would this produce decent results for deduplication of 16TB worth of pools
>> or would I need more RAM still?
>
> What matters is the amount of unique data in your pool. I'll just assume
> it's all unique, but of course that's ridiculous, because if it's all unique
> then why would you want to enable dedup? But anyway, I'm assuming 16T of
> unique data.
>
> The rule is a little less than 3G of RAM for every 1T of unique data. In
> your case, 16 * 2.8 = 44.8G of RAM required in addition to your base RAM
> configuration. You need at least 48G of RAM. Or less unique data.

To follow up on Ned's estimation, please let us know what kind of data you're planning on putting in the dedup'd zpool. That can really give us a better idea as to the number of slabs the pool will have, which is what drives dedup RAM and L2ARC usage.

You also want to use an SSD for L2ARC, NOT for ZIL (though you *might* also want one for ZIL, depending on your write patterns).

In all honesty, these days it doesn't pay to dedup a pool unless you can count on large amounts of common data. Virtual machine images, incremental backups, ISO images of data CDs/DVDs, and some video are your best bets. Pretty much everything else is going to cost you more in RAM/L2ARC than it's worth. IMHO, you don't want dedup unless you can *count* on a 10x savings factor.

Also, for reasons discussed here before, I would not recommend a Core i7 for use as a fileserver CPU. It's an Intel desktop CPU, it almost certainly won't support ECC RAM on your motherboard, and it's seriously overpowered for your use. See if you can find a nice socket AM3+ motherboard for a low-range Athlon X3/X4.
You can get ECC RAM for it (even in a desktop motherboard), it will cost less, and it will perform at least as well. Dedup is not CPU-intensive. Compression is, and you may very well want to enable that, but you're still very unlikely to hit a CPU bottleneck before RAM starvation or disk wait occurs.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
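To make the "count on a 10x savings factor" advice concrete, here is a back-of-the-envelope break-even sketch. All dollar figures are hypothetical placeholders invented for illustration (not real quotes), and the real case for a high ratio is as much about avoiding DDT performance cliffs as about raw hardware price:

```python
# Illustrative dedup break-even arithmetic: does the disk space dedup
# saves cost more or less than the extra RAM the dedup table demands?
# All prices below are made-up placeholder figures, not real quotes.

RAM_PER_TB_UNIQUE_GB = 2.8   # the rule of thumb from earlier in the thread
PRICE_PER_GB_RAM = 15.0      # hypothetical $/GB of server RAM
PRICE_PER_TB_DISK = 40.0     # hypothetical $/TB of raw disk

def dedup_pays(stored_tb, dedup_ratio):
    """True if the disk dollars saved exceed the RAM dollars spent."""
    unique_tb = stored_tb / dedup_ratio
    ram_cost = unique_tb * RAM_PER_TB_UNIQUE_GB * PRICE_PER_GB_RAM
    disk_saved = (stored_tb - unique_tb) * PRICE_PER_TB_DISK
    return disk_saved > ram_cost

for ratio in (1.5, 2.0, 10.0):
    print(f"dedup ratio {ratio:4.1f}x -> pays for itself: {dedup_pays(16, ratio)}")
```

With these placeholder prices, a 16TB pool at 1.5x or even 2x dedup costs more in RAM than it saves in disk, while 10x comes out comfortably ahead; plug in your own prices, since the crossover moves with them.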
On 6 February 2011 01:34, Michael <michael.armstrong at gmail.com> wrote:
> I'm currently running 2 zpools, each in a raidz1 configuration, totalling
> around 16TB of usable data. I'm running it all on an OpenSolaris-based box
> with 2GB of memory and an old Athlon 64 3700 CPU. I understand this is very
> poor and underpowered for deduplication, so I'm looking at building a new
> system, but wanted some advice first. Here is what I've planned so far:
>
> Core i7 2600 CPU
> 16GB DDR3 memory
> 64GB SSD for ZIL (optional)

http://ark.intel.com/Product.aspx?id=52213

The desktop Core i* range doesn't support ECC RAM at all; this could potentially be a pool breaker if you get a flipped bit in the wrong place (a significant metadata block). Just something to keep in mind.

Also, Intel have issued a recall (ish) for all of the 6-series chipsets released so far: the PLL unit for the 3Gbit SATA ports on the chipset is driven too hard and will likely degrade over time (5~15% failure rate over three years). They are talking about a March~April time frame for fixed parts in the channel. If you don't plan on using the 3Gbit SATA ports, then you're fine.

Intel will make LGA 1155 Xeons at some point, i.e.
http://en.wikipedia.org/wiki/List_of_future_Intel_microprocessors#.22Sandy_Bridge.22_.2832_nm.29_8
They support ECC (just check for a specific QVL after launch; "DDR3 ECC" isn't necessarily the only thing you need to look for). I think the Feb 20 release date may have been pushed for the chipset respin.

Cheers,