Hello,

What kind of x86 CPU does ZFS prefer? In particular, what kind of CPU is optimal when using RAID-Z with a large number of disks (8)?

Does L2 cache size play a big role: 256KB vs 512KB vs 1MB? Are there any performance improvements when using a dual-core or quad-processor machine?

I am choosing a CPU for a system intended primarily for ZFS and am wondering whether paying the extra price for a larger cache, or going dual-core, will provide any benefit. Or would it be better to put the money towards a higher-clocked CPU?
Siegfried Nikolaivich wrote:
> Hello,
>
> What kind of x86 CPU does ZFS prefer? In particular, what kind of CPU is optimal when using RAID-Z with a large number of disks (8)?

My experience is that for hardware that will be used in a server-oriented role, there are a lot of considerations that need to be taken into account in addition to MHz and cache sizes. But for ZFS, it has been said often that it currently performs much better with a 64-bit address space, such as that of Opterons and other AMD64 CPUs. I think this plays a bigger part in a ZFS server performing well than MHz and cache size do.

Darren
> Hello,
>
> What kind of x86 CPU does ZFS prefer? In particular, what kind of CPU is optimal when using RAID-Z with a large number of disks (8)?
>
> Does L2 cache size play a big role: 256KB vs 512KB vs 1MB? Are there any performance improvements when using a dual-core or quad-processor machine?

The most important factor when selecting a CPU for ZFS is that it /must/ be a 64-bit CPU. It's very virtual-memory hungry. (It works on 32-bit systems, but there are many limitations.)

Casper
Casper,

Does this mean it would be good practice to increase the amount of memory and/or swap space we usually recommend if the customer intends to use ZFS very heavily? Sorry if this is a dumb question!

Warmest Regards
Steven Sim

Casper.Dik at Sun.COM wrote:
>> Hello,
>>
>> What kind of x86 CPU does ZFS prefer? In particular, what kind of CPU is optimal when using RAID-Z with a large number of disks (8)?
>>
>> Does L2 cache size play a big role: 256KB vs 512KB vs 1MB? Are there any performance improvements when using a dual-core or quad-processor machine?
>
> The most important factor when selecting a CPU for ZFS is that it /must/ be a 64-bit CPU. It's very virtual-memory hungry. (It works on 32-bit systems, but there are many limitations.)
>
> Casper
Steven Sim wrote:
> Casper,
>
> Does this mean it would be good practice to increase the amount of memory and/or swap space we usually recommend if the customer intends to use ZFS very heavily?

ZFS doesn't necessarily use more memory (physical or virtual) than UFS; it needs more VM *address space* (not the same thing as more VM), hence the 64-bit processor.

--
Darren J Moffat
> Casper,
>
> Does this mean it would be good practice to increase the amount of memory and/or swap space we usually recommend if the customer intends to use ZFS very heavily?

Memory is always good, but it is *virtual* memory (address space) which matters most. The 32-bit kernel only has 1GB or so of address space (it's shared with userland); the 64-bit kernel has something closely resembling infinity.

Casper
On Thu, 2006-07-06 at 01:57 -0700, Siegfried Nikolaivich wrote:
> What kind of x86 CPU does ZFS prefer? In particular, what kind of CPU is optimal when using RAID-Z with a large number of disks (8)?

An additional point here: to an extent this depends on what you're going to be using the system for. I've got an old 733MHz Pentium III machine running the latest Nevada build, which happily runs ZFS -- it gets used as a simple backup-via-rsync server once a month or so, and happily manages ~180GB of data.

cheers,
tim
--
Tim Foster, Sun Microsystems Inc, Operating Platforms Group
Engineering Operations  http://blogs.sun.com/timf
Darren J Moffat writes:
> Steven Sim wrote:
>> Casper,
>>
>> Does this mean it would be good practice to increase the amount of memory and/or swap space we usually recommend if the customer intends to use ZFS very heavily?
>
> ZFS doesn't necessarily use more memory (physical or virtual) than UFS; it needs more VM *address space* (not the same thing as more VM), hence the 64-bit processor.
>
> --
> Darren J Moffat

I concur, and add two things. First, there is somewhat of a bug today in which ZFS allows applications to dirty too much memory before being throttled. I mention this because it's the issue that created the notion that ZFS needs more RAM; it does not. With better application throttling in place, this urban legend will debunk itself.

Next, I'm no VM expert, but since ZFS does reference cached data in the kernel, I do think it's a best practice to configure some extra swap to account for these larger kernels.

-r
On Thu, 6 Jul 2006, Siegfried Nikolaivich wrote: [ ... reformatted ...]
> Hello,
>
> What kind of x86 CPU does ZFS prefer? In particular, what kind of CPU is optimal when using RAID-Z with a large number of disks (8)?

BTW: I've read the existing followups (all good stuff!). 64-bit AMD.

> Does L2 cache size play a big role: 256KB vs 512KB vs 1MB? Are there any performance improvements when using a dual-core or quad-processor machine?

Solaris code, in general, is designed to take advantage of extra L2 processor cache. As an aside, if you're looking at a 939-pin AMD CPU, grab one with 1MB of cache (per core) quickly. The rumor is that large-cache parts will become unavailable in the near future, as AMD can squeeze more small-cache CPUs per wafer -- and bring them to market to counter the upcoming price war with Intel.

PS: A big cut in AMD prices is rumored to be less than one month away -- but, by then, the 1MB cache parts may (or may not?) be history.

> I am choosing a CPU for a system intended primarily for ZFS and am wondering whether paying the extra price for a larger cache, or going dual-core, will provide any benefit. Or would it be better to put the money towards a higher-clocked CPU?

Solaris "likes" a system with 2 or more processors (or cores). It makes for a very responsive system. Recommendation: an X2 4400+ in a 939-pin single-socket system will provide good long-term value.

PS: Many have had really good luck over-clocking the Model 165 Opteron (939-pin) dual-core part -- it is a very easy/conservative overclock. Unfortunately, it is no longer available from AMD. You might try to eBay one.

PPS: You may have noticed that the newer AM2 parts are mostly 512KB cache. Rumor is that parts with 256KB cache (ouch!) will be more easily available soon.

Email me off-list if you have any further questions.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
Darren J Moffat wrote:
> Steven Sim wrote:
>> Casper,
>>
>> Does this mean it would be good practice to increase the amount of memory and/or swap space we usually recommend if the customer intends to use ZFS very heavily?
>
> ZFS doesn't necessarily use more memory (physical or virtual) than UFS; it needs more VM *address space* (not the same thing as more VM), hence the 64-bit processor.

Off-topic query: how can ZFS require more VM address space but not more VM?

thanks.
Pramod
Hi,

I've just gone through the following URL:

http://blogs.sun.com/roller/page/roch?entry=the_dynamics_of_zfs

For those interested, I got to the above URL from

http://www.solarisinternals.com/wiki/index.php/Solaris_Internals_and_Performance_FAQ

Under the section "DOES ZFS REALLY USE MORE RAM?", he clearly elaborates ZFS's memory requirements.

For the off-topic query "How can ZFS require more VM address space but not more VM?", my simple explanation (which may be wrong) would be that the Solaris VM immediately "reserves" swap space upon a memory request. This takes place even though no actual physical page (and also no VM page?) has yet been assigned to the malloc call. It was made very clear in the first edition of Solaris Internals that Solaris avoids the "lazy" method of memory assignment used by AIX and Linux, which may result in things like the OOM killer described clearly in http://lwn.net/Articles/104179/

Am I right? Or completely off course?

Warmest Regards
Steven Sim

Pramod Batni wrote:
> Darren J Moffat wrote:
>> ZFS doesn't necessarily use more memory (physical or virtual) than UFS; it needs more VM *address space* (not the same thing as more VM), hence the 64-bit processor.
>
> Off-topic query: how can ZFS require more VM address space but not more VM?
>
> thanks.
> Pramod
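To make the reservation point concrete, here is a minimal user-land sketch (my own illustration, not code from Solaris Internals; the 64GB figure is just an arbitrary larger-than-swap request). On an eager-reserving VM such as Solaris's, the malloc() itself fails cleanly when swap cannot be reserved; on an overcommitting VM, the same call may succeed and the process only fails later, when the pages are first touched.

    /* eager_reserve.c -- illustrative sketch, not from the thread.
     * Build: cc -o eager_reserve eager_reserve.c
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        size_t huge = (size_t)64 * 1024 * 1024 * 1024;  /* 64GB request */
        char *p = malloc(huge);

        if (p == NULL) {
            /* Eager reservation: swap could not be reserved, so we
             * learn about it here, at allocation time. */
            printf("malloc failed up front: reservation denied\n");
            return 0;
        }

        /* Overcommit: the pointer looks good, but only touching the
         * pages forces real allocation -- possibly invoking an OOM
         * killer, as in the LWN article cited above. */
        printf("malloc succeeded; now touching pages...\n");
        memset(p, 0xa5, huge);
        free(p);
        return 0;
    }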
> But for ZFS, it has been said often that it currently performs
> much better with a 64-bit address space, such as that of
> Opterons and other AMD64 CPUs. I think this plays a
> bigger part in a ZFS server performing well than MHz
> and cache size do.

I will no doubt be selecting a 64-bit-capable CPU. My main concern is whether getting a dual-core vs a single-core processor will give ZFS any noticeable performance gain. Is ZFS multi-threaded in any way? I will also be heavily using NFS and possibly Samba, but a single-core processor with a much higher clock speed is much cheaper than the dual-core offerings from AMD.

Also, there is a premium price for extra L2 cache. Would the ZFS checksumming and parity calculations benefit at all from a larger L2 cache, say 1MB? Or would the instructions fit fine inside 512KB? I know it depends on the application, but some general info on this subject will help my selection.

Thanks
> Darren J Moffat wrote:
>> ZFS doesn't necessarily use more memory (physical or virtual) than UFS; it needs more VM *address space* (not the same thing as more VM), hence the 64-bit processor.
>
> Off-topic query: how can ZFS require more VM address space but not more VM?

You mean, not more physical memory?

Casper
On Thu, Jul 06, 2006 at 09:53:32PM +0530, Pramod Batni wrote:
> Off-topic query: how can ZFS require more VM address space but not more VM?

The real problem is VA fragmentation, not consumption. Over time, ZFS's heavy use of the VM system causes the address space to become fragmented. Eventually, we will need to grab a 128k block of contiguous VA, but can't find a contiguous region, despite having plenty of memory (physical or virtual).

This is only a problem on 32-bit kernels, because on a 64-bit kernel VA is effectively limitless.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
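Eric's failure mode is easy to model in user land. The following toy (a sketch of the principle only -- real kernel VA is managed by vmem, not a bitmap) carves a 1MB "address space" into 64K chunks and frees every other one: half the space is free, yet no 128K contiguous run exists.

    /* va_frag.c -- toy model of VA fragmentation; illustrative only. */
    #include <stdio.h>
    #include <string.h>

    #define SPACE  (1024 * 1024)  /* 1MB toy address space     */
    #define CHUNK  (64 * 1024)    /* allocation granule        */
    #define WANT   (128 * 1024)   /* contiguous run we need    */

    static char used[SPACE];      /* 1 = allocated, 0 = free   */

    /* return offset of a free run of at least len bytes, or -1 */
    static long find_free_run(size_t len)
    {
        size_t run = 0;
        for (size_t i = 0; i < SPACE; i++) {
            run = used[i] ? 0 : run + 1;
            if (run >= len)
                return (long)(i - len + 1);
        }
        return -1;
    }

    int main(void)
    {
        memset(used, 1, SPACE);              /* everything allocated  */
        for (size_t off = 0; off < SPACE; off += 2 * CHUNK)
            memset(used + off, 0, CHUNK);    /* free alternate chunks */

        size_t free_bytes = 0;
        for (size_t i = 0; i < SPACE; i++)
            free_bytes += !used[i];

        /* prints: 524288 bytes free, yet no 128K run is found */
        printf("free: %zu bytes; 128K contiguous run: %s\n",
               free_bytes, find_free_run(WANT) < 0 ? "none" : "found");
        return 0;
    }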
With ZFS the primary driver isn't CPU, it's "how many drives can one attach" :-)

I use an 8-SATA- and 2-PATA-port board:

http://supermicro.com/Aplus/motherboard/Opteron/nForce/H8DCE.cfm

(But there was a v20z I could steal registered RAM and CPUs from.) The H8DCE can't use the SATA HBA framework, which only supports Marvell 88SX and SI3124 controllers, so perhaps a 10-SATA and 2-PATA board (14 drives!)

http://www.amdboard.com/abit_sv-1a.html

would be a better choice.

Rob
Siegfried Nikolaivich wrote:
>> But for ZFS, it has been said often that it currently performs
>> much better with a 64-bit address space, such as that of
>> Opterons and other AMD64 CPUs. I think this plays a
>> bigger part in a ZFS server performing well than MHz
>> and cache size do.
>
> I will no doubt be selecting a 64-bit-capable CPU. My main concern is whether getting a dual-core vs a single-core processor will give ZFS any noticeable performance gain. Is ZFS multi-threaded in any way? I will also be heavily using NFS and possibly Samba, but a single-core processor with a much higher clock speed is much cheaper than the dual-core offerings from AMD.

You're still constrained by memory speed.

> Also, there is a premium price for extra L2 cache. Would the ZFS checksumming and parity calculations benefit at all from a larger L2 cache, say 1MB? Or would the instructions fit fine inside 512KB? I know it depends on the application, but some general info on this subject will help my selection.

Cache works great for those things which are reused. There is relatively little data reuse in a file system. For ZFS, when you are checksumming or compressing data, there is almost zero reuse. I'd put my money in more cores and RAM rather than a higher clock.

-- richard
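To make the reuse point concrete, here is a sketch of a Fletcher-style checksum of the kind ZFS uses (a minimal illustration of the access pattern, not ZFS's actual implementation). Each word of the block is loaded exactly once and folded into four running sums; the data streams through the cache and is never revisited, so a larger L2 buys little for this loop.

    /* fletcher4-style checksum sketch; illustrative only. */
    #include <stdint.h>
    #include <stdio.h>

    static void fletcher_4(const uint32_t *buf, size_t words,
                           uint64_t sum[4])
    {
        uint64_t a = 0, b = 0, c = 0, d = 0;
        for (size_t i = 0; i < words; i++) {
            a += buf[i];   /* one load per word, no reuse */
            b += a;
            c += b;
            d += c;
        }
        sum[0] = a; sum[1] = b; sum[2] = c; sum[3] = d;
    }

    int main(void)
    {
        /* one 128K block, the largest ZFS record size */
        static uint32_t block[128 * 1024 / 4];
        block[0] = 0xdeadbeef;

        uint64_t sum[4];
        fletcher_4(block, sizeof (block) / sizeof (block[0]), sum);
        printf("%016llx %016llx %016llx %016llx\n",
               (unsigned long long)sum[0], (unsigned long long)sum[1],
               (unsigned long long)sum[2], (unsigned long long)sum[3]);
        return 0;
    }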
Eric Schrock wrote:
> On Thu, Jul 06, 2006 at 09:53:32PM +0530, Pramod Batni wrote:
>> Off-topic query: how can ZFS require more VM address space but not more VM?
>
> The real problem is VA fragmentation, not consumption. Over time, ZFS's heavy use of the VM system causes the address space to become fragmented. Eventually, we will need to grab a 128k block of contiguous VA, but can't find a contiguous region, despite having plenty of memory (physical or virtual).

Interesting. I saw and helped debug a very similar-sounding problem with VxVM and VxFS on an E10k with 15TB of EMC storage and 10,000 NFS shares years ago. This was on Solaris 2.6, so even though it was an UltraSPARC CPU there was still only a 32-bit address space.

Jeff Bonwick supplied the fixes for this; I don't remember the details, but it did help reduce the memory fragmentation. It does make me wonder, though, whether those fixes that were applicable to 32-bit SPARC work for 32-bit x86.

--
Darren J Moffat
On Fri, Jul 07, 2006 at 09:50:47AM +0100, Darren J Moffat wrote:
> Interesting. I saw and helped debug a very similar-sounding problem with VxVM and VxFS on an E10k with 15TB of EMC storage and 10,000 NFS shares years ago. This was on Solaris 2.6, so even though it was an UltraSPARC CPU there was still only a 32-bit address space.
>
> Jeff Bonwick supplied the fixes for this; I don't remember the details, but it did help reduce the memory fragmentation. It does make me wonder, though, whether those fixes that were applicable to 32-bit SPARC work for 32-bit x86.

The main difference here is that on UltraSPARC, kernel and user have separate VA spaces (due to the alternate ASIs on SPARC). This means that both user and kernel get the full 32-bit address space to themselves. On x86, kernel and user share the VA: if you look at kernbase, all addresses above it are kernel (typically 1GB), and the rest is user. So the situation on x86 is much more dire than it ever was on SPARC, even before 64-bit.

--Bill
> Interesting. I saw and helped debug a very similar-sounding problem with VxVM and VxFS on an E10k with 15TB of EMC storage and 10,000 NFS shares years ago. This was on Solaris 2.6, so even though it was an UltraSPARC CPU there was still only a 32-bit address space.
>
> Jeff Bonwick supplied the fixes for this; I don't remember the details, but it did help reduce the memory fragmentation. It does make me wonder, though, whether those fixes that were applicable to 32-bit SPARC work for 32-bit x86.

Even the 32-bit UltraSPARC kernel had about 4GB of memory/VA available to it; on the 32-bit x86 kernel this is far less -- only whatever is above the stack. (UltraSPARC has a different memory map for kernel/userland; this is different from many other CPUs, which map the kernel into the same address space as the user processes.)

Casper
On Fri, 7 Jul 2006, Darren J Moffat wrote:
> Interesting. I saw and helped debug a very similar-sounding problem with VxVM and VxFS on an E10k with 15TB of EMC storage and 10,000 NFS shares years ago. This was on Solaris 2.6, so even though it was an UltraSPARC CPU there was still only a 32-bit address space.
>
> Jeff Bonwick supplied the fixes for this; I don't remember the details, but it did help reduce the memory fragmentation. It does make me wonder, though, whether those fixes that were applicable to 32-bit SPARC work for 32-bit x86.

Not quite comparable. The work that Jeff did then was the conversion of the old rmalloc-based heap management to vmem. The problem with the old allocator was that _any_ oversize allocation activity, even if it were a growth request from a kmem cache, led to heavy heap fragmentation, and the number of fragments in an rmalloc-based mechanism (see rmalloc(9F)) is limited. Vmem scales here, and the quantum caches (which is the part that got backported to 2.6) as an intermediate "band aid" also significantly reduce the number of calls into the heap allocator backend.

vmem allows the heap to fragment -- and still to function -- which is a striking difference from rmalloc. Once the number of slots in a resource map (determined at map creation time) is reached, it doesn't matter whether there'd be free memory in the heap; you can't get at it unless you happen to request _exactly_ the size of an existing fragment. Otherwise, you'd need to split a fragment, creating two or three new ones, which you can't, as there is no free slot. Fragmentation with the pre-Solaris-8 rmalloc heap is pathological. It's not with vmem: vmem allows the heap to work even if heavily fragmented. But if you have a heavy "oversize consumer", the long-term effect is that all vmem arenas larger than the "most frequently used 'big' size" become empty. ZFS will make all free spans accumulate in the 128kB one under high load.

Ok, all that babbling in short: in Solaris 2.6, heap fragmentation was a pathological scaling problem that led to a system hang sooner or later because of kernelmap exhaustion. The vmem/quantum-cache heap does function even if the heap gets very fragmented -- it scales. It doesn't remove the possibility of the heap fragmenting, but it deals with that gracefully.

What is still there, though, is the ability of a kernel memory consumer to cause heap fragmentation -- vmem can't solve the issue that if you allocate and free a huge number of N-sized slabs in random ways over time, the heap will in the end contain mostly N-sized fragments. That's what happens with ZFS.

FrankH.

==========================================================================
No good can come from selling your freedom, not for all the gold of the
world, for the value of this heavenly gift exceeds that of any fortune
on earth.
==========================================================================
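Frank's fixed-slot failure mode is simple to demonstrate. Below is a toy resource map in the spirit of the old rmalloc(9F) allocator he describes (a sketch of the principle only, not Solaris code; NSLOTS and the sizes are arbitrary): free space is tracked in at most NSLOTS {base, len} fragments, and once the table is full, a freed extent that does not merge with an existing fragment simply cannot be recorded.

    /* rmap.c -- toy fixed-slot resource map; illustrative only. */
    #include <stdio.h>

    #define NSLOTS 4

    struct frag { unsigned base, len; };
    static struct frag map[NSLOTS];
    static int nfrag;

    /* record a freed extent; fails when the slot table is full and
     * the extent does not merge with an existing fragment */
    static int rmfree(unsigned base, unsigned len)
    {
        for (int i = 0; i < nfrag; i++) {
            if (map[i].base + map[i].len == base) {         /* append  */
                map[i].len += len;
                return 0;
            }
            if (base + len == map[i].base) {                /* prepend */
                map[i].base = base;
                map[i].len += len;
                return 0;
            }
        }
        if (nfrag == NSLOTS)
            return -1;    /* pathological: freed space is lost */
        map[nfrag].base = base;
        map[nfrag].len  = len;
        nfrag++;
        return 0;
    }

    int main(void)
    {
        /* free six non-adjacent 64K extents into a 4-slot map;
         * the gaps prevent merging, so the last two are lost */
        for (unsigned i = 0; i < 6; i++) {
            unsigned base = i * 128 * 1024;
            if (rmfree(base, 64 * 1024) != 0)
                printf("slot table full: lost 64K at 0x%x\n", base);
        }
        printf("%d fragments recorded, rest leaked\n", nfrag);
        return 0;
    }

vmem's arena structure has no such fixed fragment table, which is why it degrades gracefully where rmalloc hangs the system.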
Eric Schrock wrote:
> On Thu, Jul 06, 2006 at 09:53:32PM +0530, Pramod Batni wrote:
>> Off-topic query: how can ZFS require more VM address space but not more VM?
>
> The real problem is VA fragmentation, not consumption. Over time, ZFS's heavy use of the VM system causes the address space to become fragmented. Eventually, we will need to grab a 128k block of contiguous VA, but can't find a contiguous region, despite having plenty of memory (physical or virtual).
>
> This is only a problem on 32-bit kernels, because on a 64-bit kernel VA is effectively limitless.

Is that to say that eventually there will be no contiguous VA and the system will become unstable/unresponsive or hang? Or will it clear itself during periods of low activity?

I'm hoping to use some old 32-bit hardware (dual P3, 1GHz) with a bunch of drives (6x250 raidz) to provide file storage and some processing in a home network. I'd rather not have to worry about rebooting; I had >500 days of uptime on a previous system before a long power outage.